ML Performance Engineer

Amsterdam
In-Office
Mid level
Fintech • Software • Financial Services
The Role
Optimize ML model training pipelines for speed and reliability, applying GPU optimization techniques and mentoring team members on performance best practices.
Summary Generated by Built In

We’re looking for a performance-focused ML Engineer to help speed up large-scale model training by optimizing our internal stack and compute infrastructure. You’ll work across the full training pipeline — from GPU kernels to system-level throughput — applying profiling, CUDA-level tuning, and distributed systems techniques. The goal is to reduce training time, boost iteration speed, and use compute more efficiently.

This is a key role in a growing team building deep technical expertise in ML training systems.

Responsibilities

  • Optimize our model training pipeline to improve both speed and reliability, enabling faster and more efficient experimentation;
  • Apply GPU-level optimization techniques using tools like JAX, Triton, and low-level CUDA to improve training performance and efficiency at scale;
  • Identify and resolve performance bottlenecks across the entire ML pipeline — from data loading and preprocessing to CUDA kernels;
  • Build tools and extend internal infrastructure to support scalable, reproducible, and high-performance training workflows;
  • Mentor and support engineers and researchers in adopting performance best practices across the team;
  • Help grow the team’s GPU and systems-level capabilities, and contribute to a culture of engineering excellence and rapid experimentation.

Requirements

  • Demonstrated experience optimizing neural network training in production or large-scale research settings, for example reducing training time, improving hardware utilization, or accelerating feedback cycles for ML researchers;
  • Extensive practical experience with ML frameworks such as PyTorch or JAX;
  • Hands-on experience with training and optimizing deep learning architectures such as LSTM and Transformer-based models, including different attention mechanisms;
  • Experience working with CUDA, Triton, or other low-level GPU technologies for performance tuning;
  • Proficiency in profiling and debugging training pipelines, using tools such as Nsight, cProfile, CUDA tools, gdb, and the PyTorch profiler;
  • Understanding of distributed training concepts (e.g. data/model/tensor/sequence/pipeline/context parallelism, memory and compute tradeoffs);
  • A collaborative and proactive mindset, with strong communication skills and the ability to mentor teammates and partner effectively within the team;
  • Strong proficiency in Python for building infrastructure-level tooling, debugging training systems, and integrating with ML frameworks and profiling tools.

What we offer

  • High base salary and social benefits;
  • Generous bonus structure. We are very flexible in discussing salary and conditions of employment;
  • Cutting-edge hardware and software in production, backed by deep in-house technical expertise, so that bold ideas can be implemented and deliver strong results. Ownership over initiatives that directly solve business problems;
  • Ability to trade on dozens of international exchanges;
  • Flexible workflow (no formalism or bureaucracy, no pressure or over-management) and a flexible working schedule;
  • Tuition reimbursement, conference and training sponsorship.

Top Skills

cProfile
CUDA
GDB
JAX
Low-Level GPU Technologies
Nsight
PyTorch
Torch Profiler
Triton

The Company
HQ: Amsterdam
117 Employees
Year Founded: 2008

What We Do

We’re Pinely, an algorithmic trading firm, privately owned and funded.
As a proprietary trading firm, we’re not using capital from clients or external investors to trade. That makes all of Pinely ours: our ideas, our money, our technology. All built and thought out by our people.

We trade on the world’s financial markets using our in-house developed research and technology. Most of our strategies are based on HFT (High Frequency Trading) algorithms and depend on our ultra-low latency networks to operate optimally.
Active in various financial markets and products, the Pinely family consists of several firms and offices in Singapore, Cyprus and the Netherlands, sharing the same base technology.
Every day, we put our algorithmic research and technology to the test. Every day, we face the world’s financial markets with our money on the line.
And every day, we come out on top. That’s because we’re driven by the best researchers and powered by the best technologists. But most of all, it’s because we love what we do!
