Skills Required
- Bachelor's degree or equivalent
- 7+ years total engineering or operational experience
- At least 5 years of relevant experience in a similar role
- Experience within large and complex global enterprises (high availability, high transaction rates, geographical distribution)
- Experience with TensorRT, vLLM, Triton, FasterTransformer, ONNX, GGUF, and model quantization (FP16, INT8, FP8)
- Experience with NCCL, MPI, InfiniBand, RDMA, and multi-node GPU workloads
- Experience with GPU profiling tools: nvidia-smi, Nsight, nvprof, TensorRT Profiler
- Experience with Kubernetes/OpenShift and GPU scheduling/orchestration tools (Kubeflow, Ray, KServe)
- Proficiency in Python, Scala, SQL, and Linux
What We Do
ENFINT is a software development company that builds AI-powered solutions for financial institutions. Drawing on expertise in banking, financial technology, IT, and AI, the company develops products such as the Flametree.ai platform, which raise labor productivity by supplementing or replacing manual tasks with AI-driven capabilities, helping institutions optimize operations and sustain growth.








