Top AI & Machine Learning Jobs
Artificial Intelligence • Cloud • Software • Infrastructure as a Service (IaaS)
Lead and grow an engineering team building serverless inference infrastructure and APIs for large-scale AI workloads. Drive architecture, throughput and GPU utilization optimizations, partner across platform and product teams, own operational readiness and incident response, and deliver highly available, production-grade multi-tenant systems.
Top Skills:
API Gateway, Cloud-Native Multi-Region Architectures, GPU, Kubernetes, LLMs, Microservices, Observability, Service Mesh, SRE, TensorRT-LLM, Traffic Routing, Triton, vLLM
Artificial Intelligence • Cloud • Software • Infrastructure as a Service (IaaS)
Lead and grow an engineering team to design, build, and scale a production LLM inference platform. Drive architecture, scheduling, GPU utilization, orchestration, observability, security, and cross-functional delivery for large Kubernetes clusters and multi-tenant AI workloads.
Top Skills:
CRIU, CUDA, GPU (NVIDIA, AMD), gVisor, HAMi, KAI-Scheduler, Kata Containers, Kubernetes, MicroVMs, NUMA, NVIDIA cuda-checkpoint, NVIDIA Grove, NVLink, OCI Image Volumes, PCIe, ROCm, SGLang, vLLM
Artificial Intelligence • Cloud • Software • Infrastructure as a Service (IaaS)
Lead and grow a Forward Deployed Engineering team focused on internal data and AI platform pipelines. Drive customer engagements, validate and optimize AI workloads, influence product roadmap, build automation and benchmarking, collaborate with partners, and support production deployments and platform scale.
Top Skills:
CrewAI, CUDA, Go, Kubernetes, LangGraph, LlamaIndex, llm-d, MCP, NCCL, NVIDIA Dynamo, OpenAI Agents SDK, Python, Ray Serve, RCCL, ROCm, SGLang, TensorRT, Triton, vLLM
Artificial Intelligence • Cloud • Software • Infrastructure as a Service (IaaS)
Design, build, and operate scalable, multi-tenant serverless inference services and APIs. Improve throughput, GPU utilization, reliability, and observability for large-scale AI workloads. Collaborate with platform, GPU infrastructure, and product teams, participate in on-call rotations, and drive architecture, automation, and incident reduction.
Top Skills:
Go, Kubernetes
Artificial Intelligence • Cloud • Software • Infrastructure as a Service (IaaS)
Lead and grow the Inference Orchestration engineering team to design, build, and operate Kubernetes-based AI infrastructure at scale. Drive scheduling, GPU utilization, topology-aware placement, checkpoint/restore for long jobs, fault tolerance, model distribution, security isolation, and cross-functional delivery to meet performance, cost, and reliability goals.
Top Skills:
AMD GPUs, CRIU, gVisor, HAMi, KAI-Scheduler, Kata Containers, Kubernetes, MicroVMs, NUMA, NVIDIA cuda-checkpoint, NVIDIA GPUs, NVIDIA Grove, NVLink, OCI Image Volumes, PCIe, SGLang, Triton, vLLM
Artificial Intelligence • Cloud • Software • Infrastructure as a Service (IaaS)
Technical leader designing and implementing massive-scale, stateful multi-agent multi-turn simulation and evaluation systems. Build persona synthesis pipelines, what-if benchmarking frameworks, durable workflow orchestration, and high-performance APIs while integrating LLMs and agentic architectures. Drive architecture, mentor engineers, lead cross-functional strategy, and ensure scalability, reliability, and observability for AI feedback and evaluation infrastructure.
Top Skills:
AutoGen, CrewAI, Go, gRPC, LangChain, LLM Orchestration, LLMs, Message/Event-Driven Architectures, Python, Streaming LLM Token Handling, Workflow Orchestration Engines
Artificial Intelligence • Cloud • Software • Infrastructure as a Service (IaaS)
Drive the design and operation of the Gradient AI platform, focusing on architecture, technical excellence, and innovation. Collaborate across teams, mentor engineers, and lead initiatives for scalability, performance, and reliability in AI/ML development.
Top Skills:
Agent-Development Technologies, AI/ML Platforms, CI/CD Pipelines, Cloud Applications, GenAI, IaC
Artificial Intelligence • Cloud • Software • Infrastructure as a Service (IaaS)
Lead and grow an AI engineering team to build internal copilots, agents, and an AI platform. Act as a player-coach contributing to architecture, prototypes, and code. Define technical roadmap, partner with business leaders to ship AI-native workflows, and ensure governance, observability, evaluation, and cost controls for LLM-driven systems.
Top Skills:
Agents, AutoGen, Claude, Claude Code, CrewAI, Cursor, Evaluation Harnesses, GitHub Copilot, Go, Greenhouse, gRPC, Java, Kubernetes, LangGraph, LLMOps, LLMs, MCP Gateway, Model Context Protocol (MCP), NetSuite, Observability Stacks, Python, Retrieval-Augmented Generation (RAG), Salesforce, Serverless, TypeScript, Vector Stores, Workday
Artificial Intelligence • Cloud • Software • Infrastructure as a Service (IaaS)
The Staff Forward Deployed Engineer drives AI adoption for strategic customers by solving complex cloud infrastructure challenges, developing scalable assets, and influencing product roadmaps through collaboration and technical expertise.
Top Skills:
CrewAI, CUDA, Go, GPU, Kubernetes, LangGraph, LlamaIndex, OpenAI Triton, Pulumi, Python, ROCm, TensorRT, Terraform
Artificial Intelligence • Cloud • Software • Infrastructure as a Service (IaaS)
The Staff Forward Deployed Engineer will work with AI-native customers to solve cloud infrastructure challenges, build scalable tools, optimize AI workloads, and influence product development by embedding with clients and establishing best practices.
Top Skills:
CUDA, Go, Kubernetes, OpenAI Triton, Pulumi, Python, ROCm, TensorRT, Terraform