Principal ML Investigator

Sunnyvale, CA, USA
In-Office
Expert/Leader
Artificial Intelligence
The Role
Lead and build a research-focused ML team to develop post-training/RL, dataset curation, LLM pretraining, and sparsity techniques; adapt algorithms to Cerebras hardware, run systematic model evaluation, and collaborate with internal teams and external partners to drive production deployments and hardware/software co-design.

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.  

Cerebras' current customers include top model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras to deploy 750 megawatts of compute, transforming key workloads with ultra-high-speed inference.

Thanks to the groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services. This order of magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.

About The Role

Cerebras is forming a new ML team focused on a new research effort that complements our existing teams. We are seeking a principal investigator who will partner with our ML leaders to define this effort and build up the new team and its capabilities. The new team will coordinate with our current ML teams: Field ML, which works directly with customers; Applied ML, which builds new ML capabilities and applications for customers; and Core ML, which adapts ML algorithms to exploit the unique capabilities of Cerebras hardware. The new team may take on the same or complementary responsibilities.

We would like the new team to work on some of the following areas:

  • Post-training and reinforcement learning: Techniques to improve deployed model quality through further training, fine-tuning, RL, and a focus on particular downstream tasks;
  • Dataset curation and optimization: Techniques to collect and select high-quality data, helping models train or tune more quickly or to higher quality;
  • LLM pretraining: Techniques to ensure stability and compute efficiency while pretraining high-quality models; may include training dynamics, parameterizations, numerics, and others;
  • Sparsity: Techniques to sparsify models or data to improve training time-to-quality, or to optimize inference speed or throughput;
  • Domains: Coding agents, reasoning agents, and generative language, image, and video models.

Principal Investigator Responsibilities

  • Build up a team capable of industry research and advanced development.
  • Organize various advanced development topics into a cohesive agenda.
  • Adapt novel algorithms and model architectures to run on the Cerebras platform.
  • Systematically train, tune, and evaluate models to guide/advise production scenarios.
  • Collaborate with other teams to co-design next-generation hardware and software architectures.
  • Collaborate with external partners (customers and academics) to build insight and credibility.

Skills & Qualifications
  • PhD in Computer Science or related field.
  • Strong grasp of ML theory in one or more of the above areas.
  • Proven experience engineering ML systems for scale or production deployment.
  • Experience leading a team of researchers or engineers.

Preferred Skills & Qualifications
  • Track record of patents or publications in top-tier conferences or journals.
  • Experience with large language models (e.g., GPT family, Llama).
  • Experience with distributed training concepts and frameworks.
  • Experience in training speed optimizations, such as model architecture transformations to target hardware, or low-level kernel development (e.g., Triton).
  • Ability to analytically model or optimize system performance.

Why Join Cerebras

People who are serious about software make their own hardware. At Cerebras we have built a breakthrough architecture that is unlocking new opportunities for the AI industry. With dozens of model releases and rapid growth, we’ve reached an inflection point in our business. Members of our team tell us there are five main reasons they joined Cerebras:

  1. Build a breakthrough AI platform beyond the constraints of the GPU.
  2. Publish and open source their cutting-edge AI research.
  3. Work on one of the fastest AI supercomputers in the world.
  4. Enjoy job stability with startup vitality.
  5. Enjoy a simple, non-corporate work culture that respects individual beliefs.

Read our blog: Five Reasons to Join Cerebras in 2026.

Apply today and join us at the forefront of groundbreaking advancements in AI!

Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.

The Company
HQ: Sunnyvale, CA
402 Employees
Year Founded: 2016

What We Do

Cerebras Systems is a team of pioneering computer architects, computer scientists, deep learning researchers, functional business experts, and engineers of all types. We have come together to build a new class of computer to accelerate artificial intelligence work by three orders of magnitude beyond the current state of the art. The CS-2 is the fastest AI computer in existence. It contains a collection of industry firsts, including the Cerebras Wafer Scale Engine (WSE-2). The WSE-2 is the largest chip ever built: it contains 2.6 trillion transistors and covers more than 46,225 square millimeters of silicon, whereas the largest graphics processor on the market has 54 billion transistors and covers 815 square millimeters. In artificial intelligence work, large chips process information more quickly, producing answers in less time. As a result, neural networks that in the past took months to train can now train in minutes on the Cerebras CS-2, powered by the WSE-2. Join us: https://cerebras.net/careers/
