Zaimler

Machine Learning Engineer, ML Platform

San Mateo, CA, USA

In-Office

Senior level

Artificial Intelligence • Big Data • Machine Learning • Software

The Role

The role involves developing scalable ML algorithms, optimizing LLM inference performance, managing model serving pipelines, and collaborating with data engineers for efficient knowledge graph generation.

Summary Generated by Built In

About zaimler

AI agents can't reason over data they don't understand. Enterprise data today is fragmented across dozens of systems with no shared context, meaning, or structure, and that's why most enterprise AI is failing. The shift from copilots to autonomous agents is creating an entirely new infrastructure layer, and we're building it.

zaimler is the context infrastructure for the agentic era: a platform that automatically discovers domain knowledge, maps relationships, and gives AI agents the semantic understanding to operate with precision at scale. Imagine knowledge graphs that support real-time inference, built for systems that need to reason, not just retrieve.

zaimler was founded by Biswajit Das (ex-VP Engineering, Truera), a data infrastructure veteran and former Chief Architect at Visa, and Sofus Macskassy (ex-Director of Engineering, LinkedIn), who built one of the industry's largest production knowledge graphs at LinkedIn. We're growing and deploying with major enterprises across insurance, travel, and technology. If you want to build the infrastructure that the next decade of enterprise AI runs on, we'd love to talk.

About the Role

You’ll join our ML team focused on turning raw enterprise data into structured, contextualized knowledge graphs and embeddings. You’ll develop novel, highly scalable algorithms for ML and data engineering that make our overall system more efficient; experiment with new approaches for distilling large models into smaller, more efficient ones; improve retrieval, ranking, and reasoning performance through feedback loops; and prototype methods that help LLMs extract and act on real-world knowledge.

We're looking for someone who thrives on iteration, cares about building with rigor, and is hungry to learn from some of the best engineers and researchers in the field.

What You’ll Be Doing

Build and maintain training infrastructure, feature stores, and model serving pipelines
Optimize LLM inference performance: compute efficiency, memory management, latency, and throughput
Read, debug, and contribute to LLM runtime and supporting library code (Rust and/or C++)
Deploy and manage models at scale using tools like vLLM and Baseten
Architect scalable pipelines for model training and serving across GPU infrastructure
Collaborate with ML and data engineers to ensure the platform meets research and production needs

Prior Experience

PhD in CS, ML, or a related field, or an MS with 4+ years of relevant industry experience
Background in LLM optimization: inference efficiency, quantization, memory layout, or serving performance
Ability to read, navigate, and debug LLM source code and underlying runtime libraries
Comfortable in Rust and/or C++ at the systems level; strong Python required
Strong algorithmic fundamentals: data structures, complexity, distributed systems
Hands-on experience with model serving infrastructure (vLLM, Baseten, Triton, or similar)
Experience setting up and scaling ML pipelines end-to-end

Nice to Have

Familiarity with feature store design and management
Experience with GPU cluster management and optimization
Contributions to open-source ML infrastructure or LLM tooling
Experience with Ray, ONNX, TensorRT, or similar optimization and serving frameworks
Understanding of transformer internals and attention mechanisms at the implementation level

Why Join

A rare chance to be a founding engineer shaping both company and product direction.
Competitive salary, benefits, and meaningful equity.
Work alongside engineers and researchers from LinkedIn, Visa, Meta, and Branch.
Onsite culture in San Mateo, designed for deep collaboration and high-velocity building.
Full benefits package (Medical, Dental, Vision, 401k).
We sponsor H-1B visas and assist with immigration processes.

We value builders over résumés. If this role excites you but you don't check every box, we still want to hear from you. zaimler is an equal opportunity employer.

The Company

12 Employees

What We Do

Zaimler is an AI infrastructure company that provides a platform for enterprise AI agents. It focuses on discovering domain knowledge, mapping relationships, and providing semantic understanding to enable autonomous agents to reason over fragmented enterprise data. Founded by industry veterans, the company aims to build the infrastructure layer for the agentic era, supporting real-time inference and precision at scale for enterprises in sectors like insurance, travel, and technology.