About Graphcore
How often do you get the chance to build a technology that transforms the future of humanity?
Graphcore products have set the standard in made-for-AI compute hardware and software, gaining global attention and industry acclaim. Now we are developing the next generation of artificial intelligence compute with systems that will allow AI researchers to develop more advanced models, help scientists unlock exciting new discoveries, and power companies around the world as they put AI at the heart of their business.
Graphcore recently joined SoftBank Group – bringing large and ongoing investment from one of the world’s leading backers of innovative AI companies.
Job Summary
As a Senior Software Engineer in the Collectives Team, you will drive the effort to design and develop the Collectives Communication Library enabling users to utilize large computing clusters. The ideal candidate will have extensive experience in designing, developing, and maintaining complex software systems involving custom hardware. You will be responsible for leading development efforts, mentoring junior engineers, and driving technical excellence in our projects.
The Team
The Collectives team is responsible for building the Collectives Communication Library for the new AI hardware that Graphcore is developing. The library provides communication primitives optimized to achieve high bandwidth and low latency at very high scale.
Responsibilities and Duties
- Designing, implementing, testing and documenting the Collectives Communication Library for a new AI hardware accelerator
- Collaborating with other teams to design, implement and test new features
- Troubleshooting and resolving complex technical issues
- Ensuring seamless integration of new hardware with the existing AI ecosystem
- Participating in agile development – working as part of a scrum team
Candidate Profile
Essential:
- Extensive experience in software development using the C++ programming language
- Experience with Python and C programming
- Excellent problem-solving skills and ability to debug and resolve complex issues
- Strong knowledge of multithreading and inter-process communication (IPC) techniques for development of efficient concurrent applications
- Experience with unit testing frameworks such as Boost.Test and Google Test
- Proficiency with build tools such as CMake, Make and Ninja
- Strong understanding of version control systems (preferably Git)
- Ability to work within a multinational team and with multinational customers
- Excellent written and verbal communication skills
Desirable:
- Experience with RDMA networking libraries (for example libibverbs, libfabric)
- Knowledge of multithreading and parallel computing concepts, including experience with parallel algorithms and optimization for AI/ML and HPC systems
- Experience with Continuous Integration/Continuous Delivery (CI/CD) pipelines, including setting up automated workflows and deployments (for example GitHub Actions, GitLab CI)
- Experience with communication libraries (for example NCCL, MPI)
- Knowledge of machine learning frameworks (for example PyTorch)
- Knowledge of modern C++ standards (C++17/C++20)
Benefits
In addition to a competitive salary, Graphcore offers flexible working, a generous annual leave policy, private medical insurance and health cash plan, a dental plan, pension (matched up to 5%), life assurance and income protection. We have a generous parental leave policy and an employee assistance programme (which includes health, mental wellbeing, and bereavement support). We offer a range of healthy food and snacks at our central Bristol office and have our own barista bar!
We welcome people of different backgrounds and experiences; we’re committed to building an inclusive work environment that makes Graphcore a great home for everyone. We offer an equal opportunity process and understand that there are visible and invisible differences in all of us. We can provide a flexible approach to interview and encourage you to chat to us if you require any reasonable adjustments.
What We Do
Graphcore has created a new processor, the Intelligence Processing Unit (IPU), specifically designed for artificial intelligence. The IPU’s unique architecture means developers can run current machine learning models orders of magnitude faster. More importantly, it lets AI researchers undertake entirely new types of work, not possible using current technologies, to drive the next great breakthroughs in general machine intelligence.
Our next-generation 3D Wafer-on-Wafer Bow IPU systems are helping AI innovators worldwide to build better, more innovative AI solutions, whether their focus is on language and vision, exploring graph neural networks and LSTMs, or creating something entirely new.
We believe our IPU technology will become the worldwide standard for artificial intelligence compute. The performance of Graphcore’s IPU is going to be transformative across all industries and sectors, whether you are a medical researcher, a roboticist or building autonomous cars.
Our team is at the forefront of the artificial intelligence revolution, enabling innovators from all industries and sectors to expand human potential with technology. What we do really makes a difference.
We're always interested in hearing from exceptional people who want to join our team.