Senior Staff AI/ML Scale Engineer

Reposted 8 Days Ago
2 Locations
In-Office
Senior level
Semiconductor
We create custom semiconductor solutions that move, process, store, and secure data quickly and reliably.
The Role
The role involves simulation, modeling, performance analysis, and tooling for AI/ML workloads, focusing on hardware/software co-design in advanced computing environments.
Summary Generated by Built In

About Marvell

Marvell’s semiconductor solutions are the essential building blocks of the data infrastructure that connects our world. Across enterprise, cloud and AI, automotive, and carrier architectures, our innovative technology is enabling new possibilities. 

At Marvell, you can affect the arc of individual lives, lift the trajectory of entire industries, and fuel the transformative potential of tomorrow. For those looking to make their mark on purposeful and enduring innovation, above and beyond fleeting trends, Marvell is a place to thrive, learn, and lead. 

Your Team, Your Impact

This team at Marvell develops Murals, a next-generation AI/ML infrastructure simulation and design platform that enables in-depth analysis and optimization of large-scale training and inference workloads. Leveraging trace-driven simulation, performance modeling, and hardware/software co-design, the team helps shape scalable and resilient solutions for advanced workloads such as LLMs, DLRMs, GenAI, and GNNs.
Working closely with system architects, hardware designers, and ML practitioners, the team explores innovative ways to optimize compute, memory, and networking subsystems across complex datacenter environments.

What You Can Expect

  • Simulation & Modeling – Implement workflows to study AI/ML workloads using trace-driven and analytical models.

  • Performance Analysis – Profile and analyze system bottlenecks across compute, memory, and network layers.

  • Networking Studies – Evaluate collective communication performance (all-reduce, all-to-all, reduce-scatter) across different topologies and fabrics.

  • Tooling & Automation – Develop utilities for trace generation, merging, conversion, and visualization.

  • Prototype & Validation – Test distributed training and inference pipelines in simulated and real environments.

  • Hardware/Software Co-Design – Collaborate on emerging technologies (CXL, DPUs, NVLink, PCIe, UET/UEC, in-network compute).

  • Scaling Studies – Conduct performance projections and trade-off studies for next-gen AI infrastructure.

  • Knowledge Sharing – Document workflows, publish internal reports, and drive peer learning.

What We're Looking For

  • Bachelor’s, Master’s, or PhD in Computer Science, Electrical Engineering, or a related field, with 4–12 years of relevant professional experience.

  • Strong foundation in computer architecture, distributed systems, AI/ML, and operating systems.

  • Solid networking fundamentals including TCP/IP, RDMA, RoCE, UET/UEC, and switching/routing.

  • Experience with simulation frameworks (e.g., Astra-Sim, Chakra, gem5, SST, NS-3).

  • Hands-on experience with PyTorch/TensorFlow and distributed training frameworks (DDP, Horovod, DeepSpeed).

  • Strong programming skills in Python, C++, and scripting for automation.

  • Familiarity with interconnect and memory technologies (CXL, PCIe, NVLink, UAL).

  • Experience with profiling, telemetry, observability, and debugging tools.

  • Knowledge of collective communication algorithms and topology-aware scheduling.

  • Exposure to AI accelerators, memory disaggregation, DPUs, and custom silicon.

  • Familiarity with visualization tools (Perfetto, Chrome Tracing, Chakra Timeline, Flamegraphs).

  • Experience with large-scale AI training pipelines and scaling studies.

  • Interest in energy/performance trade-offs and resilience techniques.

Additional Compensation and Benefit Elements

With competitive compensation and great benefits, you will enjoy our workstyle within an environment of shared collaboration, transparency, and inclusivity. We’re dedicated to giving our people the tools and resources they need to succeed in doing work that matters, and to grow and develop with us. For additional information on what it’s like to work at Marvell, visit our Careers page.

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status.

Interview Integrity
 

As part of our commitment to fair and authentic hiring practices, we ask that candidates not use AI tools (e.g., transcription apps, real-time answer generators like ChatGPT or Copilot, or note-taking bots) during interviews.
 
Our interviews are designed to assess your personal experience, thought process, and communication skills in real time. Candidates who use such tools during an interview will be disqualified from the hiring process.

This position may require access to technology and/or software subject to U.S. export control laws and regulations, including the Export Administration Regulations (EAR). As such, applicants must be eligible to access export-controlled information as defined under applicable law. Marvell may be required to obtain export licensing approval from the U.S. Department of Commerce and/or the U.S. Department of State. Except for U.S. citizens, lawful permanent residents, or protected individuals as defined by 8 U.S.C. 1324b(a)(3), all applicants may be subject to an export license review process prior to employment.


Top Skills

AI/ML
Astra-Sim
C++
Chakra
CXL
DDP
DeepSpeed
gem5
Horovod
NS-3
NVLink
PCIe
Python
PyTorch
RDMA
RoCE
SST
TCP/IP
TensorFlow
UET

The Company
HQ: Santa Clara, CA
6,500 Employees
Year Founded: 1995

What We Do

Marvell specializes in semiconductor solutions that power a wide range of industries, from data centers and 5G networks to AI, automotive, and storage applications. Our cutting-edge products are designed to meet the constantly evolving demands of a connected world, enabling faster, more efficient and more secure data processing and communication. With a focus on excellence and a commitment to advancing technology, we develop solutions that drive progress and transform industries.

Why Work With Us

Life at Marvell means being part of innovation and enduring technology, but it's also much more. Our diverse community is strengthened through cultural events, corporate gatherings, and team-building activities that foster collaboration and make work enjoyable. At Marvell, it's not just a job; it's an enriching, community-driven experience.
