Graphcore is one of the world’s leading innovators in Artificial Intelligence compute.
It is developing hardware, software and systems infrastructure that will unlock the next generation of AI breakthroughs and power the widespread adoption of AI solutions across every industry.
As part of the SoftBank Group, Graphcore is a member of an elite family of companies responsible for some of the world’s most transformative technologies. Together, they share a bold vision: to enable Artificial Super Intelligence and ensure its benefits are accessible to everyone.
Graphcore’s teams are drawn from diverse backgrounds and bring a broad range of skills and perspectives. A melting pot of AI research specialists, silicon designers, software engineers and systems architects, Graphcore enjoys a culture of continuous learning and constant innovation.
Job Summary
As a Senior Software Engineer in the ML Software Performance Analysis team, you will play a critical role in ensuring end-to-end performance excellence of our proprietary AI hardware and software stack. You will directly report to the Performance Analysis Team Lead and collaborate closely with component teams, including ML Framework developers, Compiler and Runtime teams, Infrastructure engineers, and Product Management. Your work will directly influence the efficiency and scalability of our ML software solutions, significantly impacting our business by enabling reliable and performant AI solutions for customers.
The Team
The ML Software Performance Analysis team is a part of the wider ML Software Engineering organisation, responsible for delivering optimised, proprietary machine learning solutions. Our team consists of experienced engineers and domain experts focused on rigorous performance benchmarking, in-depth analysis, and cross-layer optimization from single chip to large-scale, distributed systems.
We work closely with both internal partners and external collaborators to ensure our solutions meet the highest standards of performance, efficiency, and scalability.
Our core responsibilities include:
- ML Software Stack Performance Reports – We publish regular reports that provide a comprehensive view of the performance status of the ML software stack
- End-to-End Performance Optimization – We take a holistic approach to performance, ensuring that local optimizations do not lead to global regressions. Our work spans component boundaries, enabling balanced and efficient performance across the entire stack
Responsibilities
- Conduct in-depth analysis of performance metrics to identify bottlenecks, inefficiencies, and regression trends across the ML stack
- Collaborate with cross-functional teams to drive end-to-end performance improvements across software components
- Prepare and deliver performance reports, summarizing key findings, trends, and recommendations
- Design, implement, and maintain performance benchmarking tools and infrastructure for large-scale ML software systems
- Investigate and resolve performance-related issues, including CPU utilization, memory usage, and network overhead
- Ensure that local optimizations do not negatively impact overall system performance, applying a global performance perspective
- Provide actionable feedback and guidance to engineering teams to support continuous performance optimization
Skills and Experience
- A passion for your work and the ability to thrive in uncertain and complex environments
- Strong programming skills in Python/C/C++, with a focus on performance-sensitive applications
- Solid understanding of computer architecture, performance profiling, and low-level system behaviour (CPU, memory, I/O)
- Experience with benchmarking and analysing complex, distributed systems
- Familiarity with Linux-based development environments and tools
- Strong problem-solving skills and ability to interpret and communicate performance data clearly
- Knowledge of ML frameworks (ideally PyTorch) and their performance characteristics
- Experience with performance analysis in GPU-accelerated environments (CUDA, ROCm, etc.)
- Familiarity with hardware performance characteristics, especially in an ML context, including high-speed networking (e.g. RoCE, RDMA)
- Familiarity with distributed computing frameworks (ideally with experience of collectives)
- Experience building dashboards or visualizations for performance monitoring (e.g., Grafana, Prometheus, or custom tooling)
- Exposure to performance regression tracking and CI pipelines for performance validation
In addition to a competitive salary, Graphcore offers an annual leave policy, medical and dental health plans, a gym card, and an employee pension (matched up to 4%). We review our benefits on a yearly basis to ensure we offer a valuable and rewarding benefits programme to our employees.
We welcome people of different backgrounds and experiences; we’re committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interviews and encourage you to chat to us if you require any reasonable adjustments.
What We Do
Graphcore has created a new processor, the Intelligence Processing Unit (IPU), specifically designed for artificial intelligence. The IPU’s unique architecture means developers can run current machine learning models orders of magnitude faster. More importantly, it lets AI researchers undertake entirely new types of work, not possible using current technologies, to drive the next great breakthroughs in general machine intelligence.
Our next generation 3D Wafer-on-Wafer Bow IPU systems are helping AI innovators worldwide to build better, more innovative AI solutions, whether their focus is on language and vision, exploring graph neural networks and LSTMs or creating something entirely new.
We believe our IPU technology will become the worldwide standard for artificial intelligence compute. The performance of Graphcore’s IPU is going to be transformative across all industries and sectors whether you are a medical researcher, roboticist or building autonomous cars.
Our team is at the forefront of the artificial intelligence revolution, enabling innovators from all industries and sectors to expand human potential with technology. What we do really makes a difference.
We're always interested in hearing from exceptional people who want to join our team.