Lead Machine Learning Engineer

Reposted 10 Days Ago
Singapore
In-Office
Senior level
Software
Why does Thoughtworks exist? To create an extraordinary impact on the world through our culture & technology excellence.
The Role
Lead the design and optimization of AI model inference systems, guiding teams and collaborating with cross-functional groups to improve performance, cost efficiency, and scalability.
Summary Generated by Built In

Machine Learning Engineers specializing in Inference Optimization focus on maximizing the efficiency, speed, and cost-effectiveness of deploying AI models across diverse environments. They apply advanced optimization techniques to improve runtime inference and application performance. Their work ensures that clients can scale AI solutions sustainably, whether in the cloud, on-premises, or at the edge.

As a Lead Machine Learning Engineer at Thoughtworks, you’ll combine deep technical capability with team leadership and architectural thinking. You’ll guide teams through complex optimization challenges, design scalable inference systems, and ensure AI solutions are not only high-performing but operationally sustainable. You’ll act as a bridge between hands-on engineering and strategic technical direction, mentoring others while shaping the standards and practices that define excellence in inference engineering.

(Note: Thoughtworks Singapore will shortlist only applicants who currently have the right to work in Singapore, i.e. Singapore Citizens and Singapore Permanent Residents.)

Job responsibilities
  • Lead the design and implementation of advanced model optimization pipelines, including quantization, pruning, and distillation.
  • Architect and tune inference runtimes and serving frameworks to achieve optimal performance across deployments.
  • Guide teams in implementing high-throughput serving strategies (continuous batching, KV caching, speculative decoding, asynchronous scheduling).
  • Develop benchmarks and performance dashboards to measure and communicate system-level efficiency improvements (throughput, latency, GPU utilization, cost).
  • Evaluate trade-offs across accuracy, performance, and cost, and design architectures to meet target SLAs across varied hardware environments (cloud, on-prem, edge).
  • Collaborate with infrastructure, MLOps, and product teams to embed inference optimization into production workflows and platform designs.
  • Provide technical leadership and mentorship to engineers, fostering a culture of experimentation, rigor, and continuous performance improvement.
  • Contribute to the development of internal frameworks, reference architectures, and playbooks for scalable and cost-efficient inference.
  • Engage with clients to translate optimization outcomes into business value and articulate the ROI of technical improvements.
Job qualifications
Technical Skills
  • Deep practical expertise in model and runtime optimization techniques (quantization, pruning, distillation, batching, caching).
  • Proven experience optimizing inference workloads using frameworks such as vLLM and NVIDIA Triton/Dynamo.
  • Strong proficiency in deep learning frameworks (e.g. PyTorch, TensorFlow) with production deployment experience.
  • Ability to diagnose and optimize performance using profiling tools (e.g. Nsight, PyTorch/TensorFlow profilers).
  • Solid understanding of GPU and accelerator architectures, and experience tuning workloads for cost and performance efficiency.
  • Experience designing and benchmarking scalable inference systems across heterogeneous environments (GPU clusters, serverless, edge).
  • Familiarity with observability stacks, telemetry, and cost instrumentation for AI workloads.
Professional Skills
  • Demonstrated ability to lead small-to-medium engineering teams or technical workstreams.
  • Skilled at balancing hands-on delivery with architectural oversight and mentorship.
  • Strong communication and stakeholder engagement skills, with the ability to connect low-level optimizations to business impact.
  • Comfortable in ambiguous and fast-evolving technology landscapes, with a passion for applied innovation.
  • Commitment to continuous learning and knowledge sharing across teams and communities.
Other things to know
Learning & Development

There is no one-size-fits-all career path at Thoughtworks: however you want to develop your career is entirely up to you. But we also balance autonomy with the strength of our cultivation culture. This means your career is supported by interactive tools, numerous development programs and teammates who want to help you grow. We see value in helping each other be our best and that extends to empowering our employees in their career journeys.

About Thoughtworks

Thoughtworks is a dynamic and inclusive community of bright and supportive colleagues who are revolutionizing tech. As a leading technology consultancy, we’re pushing boundaries through our purposeful and impactful work. For 30+ years, we’ve delivered extraordinary impact together with our clients by helping them solve complex business problems with technology as the differentiator. Bring your brilliant expertise and commitment to continuous learning to Thoughtworks. Together, let’s be extraordinary.

See our AI policy.

Top Skills

GPU
Nvidia Triton
Profiling Tools
PyTorch
Telemetry
TensorFlow
vLLM
The Company
HQ: Chicago, IL
7,674 Employees
Year Founded: 1993

What We Do

We are a leading global technology consultancy that integrates strategy, design and software engineering to enable enterprises and technology disruptors across the globe to thrive as modern digital businesses.

Why Work With Us

As technologists, we have a unique role to play in how technology should benefit all of society, pursuing a more equitable future. Part of that role is to continuously educate ourselves on the issues that matter to the causes we believe in. We recognize our privilege and strive to see the world from the perspective of the most vulnerable.
