Senior Principal Network Engineer

Reposted 2 Days Ago
Austin, TX, USA
Hybrid
Senior level
Artificial Intelligence • Semiconductor
Joining Graphcore gives you a seat at the top table, shaping the future of Artificial Intelligence.
The Role
Design and optimize AI data center networks, focusing on high-performance computing and network fabrics, while collaborating with cross-functional teams.
About us

Graphcore is one of the world’s leading innovators in Artificial Intelligence compute.
It is developing hardware, software and systems infrastructure that will unlock the next generation of AI breakthroughs and power the widespread adoption of AI solutions across every industry.
As part of the SoftBank Group, Graphcore is a member of an elite family of companies responsible for some of the world’s most transformative technologies. Together, they share a bold vision: to enable Artificial Super Intelligence and ensure its benefits are accessible to everyone.
Graphcore’s teams are drawn from diverse backgrounds and bring a broad range of skills and perspectives. A melting pot of AI research specialists, silicon designers, software engineers and systems architects, Graphcore enjoys a culture of continuous learning and constant innovation.

Job Summary

We are seeking a Senior Principal Network Engineer to help design, deploy, and optimize next‑generation AI data center networks. AI training and inference workloads require extremely high bandwidth, deterministic low latency, and zero‑packet‑loss networking environments.
In this role, you will partner closely with the Network Architecture Lead to design and scale high‑performance computing (HPC) network fabrics supporting GPU clusters. You will work across hardware, networking, and AI application layers to ensure Graphcore’s large‑scale AI infrastructure operates at peak performance.
The ideal candidate brings deep experience operating hyperscale or HPC data center networks and has expertise in high‑speed Ethernet fabrics, RDMA technologies, advanced automation, and telemetry systems.

The Team

The Data Center Network Engineering team designs and operates the high‑performance network fabrics that power Graphcore’s AI compute platforms. The team collaborates closely with hardware engineering, AI researchers, and infrastructure teams to build scalable networking environments optimized for distributed training and inference workloads.
Engineers work on pioneering technologies including high‑speed Ethernet fabrics, lossless networking, RDMA transport, and large‑scale automation frameworks to support next‑generation AI clusters.

Responsibilities and Duties
  • Assist in defining ultra‑high‑bandwidth, non‑blocking AI network fabrics (Clos spine‑leaf‑super‑spine architectures) for large‑scale distributed AI workloads.
  • Optimize performance of lossless Ethernet fabrics using congestion control mechanisms such as PFC, ECN, and DCQCN to support RDMA/RoCEv2 communication.
  • Lead initiatives to implement NetDevOps practices and develop automation for provisioning, configuration management, and network remediation.
  • Design and deploy high‑resolution telemetry pipelines to monitor network health, detect microbursts, and analyze congestion patterns.
  • Support modeling, deployment, configuration, and monitoring of data center network fabrics including scale‑out, scale‑up, and front‑end networks.
  • Collaborate cross‑functionally with hardware engineers, AI researchers, and data center operations teams to co‑design high‑performance infrastructure.
  • Provide technical leadership and mentorship to network engineers while establishing best practices and operational standards.
  • Contribute to the long‑term networking strategy and roadmap for Graphcore’s AI infrastructure.
  • Research and evaluate next‑generation high‑speed networking technologies and vendor solutions.
Candidate Profile

Essential
  • BS or MS, or equivalent experience, in Computer Science, Electrical Engineering, Network Engineering, or a related technical discipline.
  • 12+ years of progressive network engineering experience with at least 3 years in hyperscale, high‑density, or HPC data center environments.
  • Expert‑level knowledge of data center routing and switching protocols including BGP, OSPF, and EVPN‑VXLAN architectures.
  • Strong operational understanding of RDMA networking technologies such as RoCEv2 or InfiniBand.
  • Hands‑on experience with modern merchant silicon networking platforms and NOS platforms such as Arista EOS, Cisco NX‑OS, or SONiC.
  • Experience deploying high‑speed network technologies including 400G/800G optics and large‑scale fabric architectures.
  • Proficiency in automation and scripting languages such as Python, Go, Bash, or similar tools.
  • Strong collaboration and communication skills across cross‑functional engineering teams.

Desirable
  • Experience operating large‑scale AI or GPU clusters.
  • Familiarity with network telemetry frameworks and streaming analytics.
  • Experience implementing NetDevOps workflows and infrastructure automation pipelines.
  • Experience influencing vendor roadmaps or evaluating next‑generation networking technologies.

Graphcore Compensation & Benefits Highlights

  • Healthcare Strength: Health coverage includes medical and dental insurance, with US plans through Cigna and Kaiser, HDHP options with employer‑funded HSA contributions, a health cash plan, EAP access, and dedicated mental‑health support. These provisions extend to family options in some regions, reinforcing broad medical and wellbeing support.
  • Retirement Support: Retirement programs include a UK pension match up to 5% and a US 401(k) with a 100% company match up to 6% (with a true‑up). This pairing signals strong, predictable long‑term savings support across key locations.
  • Leave & Time Off Breadth: Time‑off policies feature “unlimited” holiday in the UK and flexible, generous PTO with paid US holidays. Paid family leave for birthing parents and bonding further broadens time‑away support.

The Company
HQ: Bristol
762 Employees
Year Founded: 2016

What We Do

At Graphcore, we’re building the future of AI compute. We’re a team of semiconductor, software and AI experts, with deep experience in creating the complete AI compute stack - from silicon and software to infrastructure at datacenter scale. As part of the SoftBank Group, backed by significant long-term investment, we are delivering key technology into the fast-growing SoftBank AI ecosystem. To meet the vast and exciting AI opportunity, Graphcore is expanding its teams around the world. We are bringing together the brightest minds to solve the toughest problems, in a place where everyone has the opportunity to make an impact on the company, our products and the future of artificial intelligence.

Why Work With Us

Our team is at the forefront of the machine intelligence revolution, enabling innovators from all industries to build AI-native products to expand human potential. What we do at Graphcore really makes a difference.

Graphcore Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

At Graphcore, we value wellbeing and flexibility to support a healthy work/life balance. Our hybrid approach encourages office-based colleagues to work onsite three days a week, with flexibility built on trust and transparency for everyone.

Typical time on-site: 3 days a week
Headquarters
Austin Office
Bengaluru Office
Cambridge Office
Gdańsk Office
Hsinchu Office
London Office