Graphcore

Senior Software Engineer

Reposted 16 Days Ago

Gdańsk, Województwo pomorskie

In-Office

Senior level

Artificial Intelligence • Semiconductor

The Role

The Senior Software Engineer will analyze performance metrics, optimize the ML software stack, and collaborate with various teams to enhance software performance and efficiency.

Summary Generated by Built In

About Graphcore

Graphcore is one of the world’s leading innovators in Artificial Intelligence compute. 

It is developing hardware, software and systems infrastructure that will unlock the next generation of AI breakthroughs and power the widespread adoption of AI solutions across every industry. 

As part of the SoftBank Group, Graphcore is a member of an elite family of companies responsible for some of the world’s most transformative technologies. Together, they share a bold vision: to enable Artificial Super Intelligence and ensure its benefits are accessible to everyone.  

Graphcore’s teams are drawn from diverse backgrounds and bring a broad range of skills and perspectives. A melting pot of AI research specialists, silicon designers, software engineers and systems architects, Graphcore enjoys a culture of continuous learning and constant innovation. 

Job Summary

As a Senior Software Engineer in the ML Software Performance Analysis team, you will play a critical role in ensuring end-to-end performance excellence of our proprietary AI hardware and software stack. You will report directly to the Performance Analysis Team Lead and collaborate closely with component teams, including ML Framework developers, Compiler and Runtime teams, Infrastructure engineers, and Product Management. Your work will directly influence the efficiency and scalability of our ML software solutions, significantly impacting our business by enabling reliable and performant AI solutions for customers.

The Team

The ML Software Performance Analysis team is part of the wider ML Software Engineering organisation, responsible for delivering optimised, proprietary machine learning solutions. Our team consists of experienced engineers and domain experts focused on rigorous performance benchmarking, in-depth analysis, and cross-layer optimization, from a single chip to large-scale distributed systems.

We work closely with both internal partners and external collaborators to ensure our solutions meet the highest standards of performance, efficiency, and scalability.

Our core responsibilities include:

ML Software Stack Performance Reports – We publish regular reports that provide a comprehensive view of the performance status of the ML software stack
End-to-End Performance Optimization – We take a holistic approach to performance, ensuring that local optimizations do not lead to global regressions. Our work spans component boundaries, enabling balanced and efficient performance across the entire stack

Responsibilities and Duties

Conduct in-depth analysis of performance metrics to identify bottlenecks, inefficiencies, and regression trends across the ML stack
Collaborate with cross-functional teams to drive end-to-end performance improvements across software components
Prepare and deliver performance reports, summarizing key findings, trends, and recommendations
Design, implement, and maintain performance benchmarking tools and infrastructure for large-scale ML software systems
Investigate and resolve performance-related issues, including CPU utilization, memory usage, and network overhead
Ensure that local optimizations do not negatively impact overall system performance, applying a global performance perspective
Provide actionable feedback and guidance to engineering teams to support continuous performance optimization

Candidate Profile

Essential

A passion for your work and the ability to thrive in uncertain and complex environments
Strong programming skills in Python/C/C++, with a focus on performance-sensitive applications
Solid understanding of computer architecture, performance profiling, and low-level system behaviour (CPU, memory, I/O)
Experience with benchmarking and analysing complex, distributed systems
Familiarity with Linux-based development environments and tools
Strong problem-solving skills and ability to interpret and communicate performance data clearly

Desirable

Knowledge of ML frameworks (ideally PyTorch) and their performance characteristics
Experience with performance analysis in GPU-accelerated environments (CUDA, ROCm, etc.)
Familiarity with hardware performance characteristics, especially in an ML context, including high-speed networking (e.g. RoCE, RDMA)
Familiarity with distributed computing frameworks (ideally with experience of collectives)
Experience building dashboards or visualizations for performance monitoring (e.g., Grafana, Prometheus, or custom tooling)
Exposure to performance regression tracking and CI pipelines for performance validation

Benefits

In addition to a competitive salary, Graphcore offers an annual leave policy, medical and dental health plans, a gym card, and an employee pension (matched up to 4%). We review our benefits on a yearly basis to ensure we offer a valuable and rewarding benefits programme to our employees.

We welcome people of different backgrounds and experiences; we’re committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interviews and encourage you to chat to us if you require any reasonable adjustments.

Top Skills

Python, C, C++, Linux, CUDA, ROCm

The Company

HQ: Palo Alto, CA

389 Employees

Year Founded: 2016

What We Do

Graphcore has created a new processor, the Intelligence Processing Unit (IPU), specifically designed for artificial intelligence. The IPU’s unique architecture means developers can run current machine learning models orders of magnitude faster. More importantly, it lets AI researchers undertake entirely new types of work, not possible using current technologies, to drive the next great breakthroughs in general machine intelligence.

Our next-generation 3D Wafer-on-Wafer Bow IPU systems are helping AI innovators worldwide to build better, more innovative AI solutions, whether their focus is on language and vision, exploring graph neural networks and LSTMs, or creating something entirely new.

We believe our IPU technology will become the worldwide standard for artificial intelligence compute. The performance of Graphcore’s IPU is going to be transformative across all industries and sectors, whether you are a medical researcher, a roboticist or a builder of autonomous cars.

Our team is at the forefront of the artificial intelligence revolution, enabling innovators from all industries and sectors to expand human potential with technology. What we do really makes a difference.

We're always interested in hearing from exceptional people who want to join our team.