Senior AI Infrastructure Engineer (Europe based - Remote)

3 Locations
In-Office or Remote
67K-105K Annually
Senior level
Healthtech
The Role
Sword Health is shifting healthcare from human-first to AI-first through its AI Care platform, making world-class healthcare available anytime, anywhere, while significantly reducing costs for payers, self-insured employers, national health systems, and other healthcare organizations. Sword began by reinventing pain care with AI at its core, and has since expanded into women’s health, movement health, and more recently mental health. Since 2020, more than 700,000 members across three continents have completed 10 million AI sessions, helping Sword's 1,000+ enterprise clients avoid over $1 billion in unnecessary healthcare costs. Backed by 42 clinical studies and over 44 patents, Sword Health has raised more than $500 million from leading investors, including Khosla Ventures, General Catalyst, Transformation Capital, and Founders Fund. Learn more at www.swordhealth.com.

As a Senior AI Infrastructure Engineer at Sword Health, you will own the infrastructure that brings our AI models to life in production. From optimizing LLM inference and deploying real-time voice AI agents to scaling GPU clusters that serve millions of sessions, your work will directly power the AI Care platform that is transforming healthcare worldwide.

You will sit at the intersection of ML and infrastructure - designing systems that power real-time computer vision for movement analysis, serve large language models for conversational AI, and enable low-latency voice interactions for AI agents. You'll ensure our models run at the speed and scale our members expect. This is not a traditional DevOps role; you'll be deeply embedded in AI-specific challenges like inference optimization, real-time video processing, model serving at scale, and GPU workload orchestration.

If you're passionate about pushing the boundaries of AI infrastructure performance and want to do it in a mission-driven environment where your work directly improves people's health outcomes, we'd love to have you on our team.

What you'll be doing:

  • Design, build, and maintain the inference infrastructure that powers Sword Health's AI products, ensuring models are served with high throughput, low latency, and cost efficiency.

  • Own the end-to-end deployment pipeline for AI models - from real-time computer vision powering movement analysis to large language models driving conversational AI experiences.

  • Architect and scale Kubernetes clusters for GPU-accelerated workloads, including autoscaling strategies, resource scheduling, and multi-model serving.

  • Build and operate the infrastructure behind Sword Health's real-time AI agents, including WebRTC cluster provisioning and deploying speech-to-text and text-to-speech capabilities at low latency.

  • Drive inference scaling strategies - evaluate and implement techniques such as speculative decoding, continuous batching, and model parallelism to meet growing demand without proportionally increasing costs.

  • Develop and maintain Infrastructure as Code (Terraform) and GitOps workflows tailored to GPU-enabled, AI-specific environments.

  • Instrument and monitor AI inference systems, building observability around GPU utilization, model latency, throughput, and error rates to ensure reliability and performance.

  • Collaborate closely with ML Engineers, Data Scientists, and Product teams to translate model requirements into robust, production-ready infrastructure.

  • Evaluate emerging AI infrastructure tools, frameworks, and hardware to keep Sword Health at the cutting edge of inference performance and efficiency.

  • Mentor team members on AI infrastructure best practices, fostering knowledge sharing around GPU workloads, model serving patterns, and production ML systems.

What you need to have:

  • 5+ years of experience in infrastructure engineering, with at least 2 years focused on AI/ML workloads in production environments.

  • Strong experience with Kubernetes for orchestrating GPU-accelerated workloads, including scheduling, resource management, and autoscaling for inference services.

  • Hands-on experience with model serving and inference optimization frameworks for both real-time computer vision and large language model workloads.

  • Solid understanding of LLM inference optimization techniques, including speculative decoding, batching strategies, quantization, and inference scaling patterns.

  • Experience provisioning and managing infrastructure for real-time AI systems, including WebRTC clusters and AI agent architectures.

  • Familiarity with real-time video/computer vision inference pipelines and the infrastructure challenges of processing continuous visual data streams at low latency.

  • Familiarity with speech-to-text and text-to-speech serving infrastructure and the challenges of running voice AI at low latency.

  • Experience with Infrastructure as Code (Terraform or similar) and GitOps methodologies for managing complex, GPU-enabled environments.

  • Working knowledge of GPU infrastructure - NVIDIA CUDA ecosystem, multi-GPU setups, and GPU monitoring/profiling.

  • Strong Linux systems fundamentals and networking knowledge, particularly for latency-sensitive, real-time workloads.

  • Fluent in English (written and oral).

  • A proactive, ownership-driven mindset - you see a bottleneck in an inference pipeline and you fix it before it becomes a problem.

What we would love to see:

    AI Inference & Model Serving:

  • Experience with LLM serving engines such as vLLM, SGLang, or llm-d.

  • Experience with NVIDIA Triton Inference Server and TensorRT for real-time computer vision workloads.

  • Familiarity with NVIDIA Riva or similar platforms for STT/TTS serving.

  • Understanding of speculative decoding, continuous batching, quantization, and model parallelism techniques.

    Kubernetes & Infrastructure:

  • Experience with Istio or similar service mesh.

  • Experience with Kafka for event streaming.

  • Experience with Prometheus, AlertManager, and Grafana for monitoring and observability.

  • Experience with Elasticsearch, Logstash, and Kibana (ELK) for log management.

  • Experience with Vault for secrets management.

  • Experience with Redis, MySQL, and DNS management.

  • Experience provisioning infrastructure on AWS, Azure, or GCP.

  • Good knowledge of cloud networking, including VPC management, routing, NAT, and troubleshooting with tools like tcpdump.

    General:

  • Experience with WebRTC infrastructure and real-time media streaming.

  • Experience with Python, Go, or similar languages commonly used in ML infrastructure tooling.

  • Familiarity with Scrum methodology.

To ensure you feel good solving a big Human problem, we offer:

  • A stimulating, fast-paced environment with lots of room for creativity;

  • A bright future at a promising high-tech startup company;

  • Career development and growth, with a competitive salary;

  • The opportunity to work with a talented team and to add real value to an innovative solution with the potential to change the future of healthcare;

  • A flexible environment where you can control your hours (remotely) with unlimited vacation;

  • Access to our health and well-being program (digital therapist sessions);

  • Remote or Hybrid work policy.

To get to know more about our Tech Stack, check here.

Sword Health complies with applicable Federal and State civil rights laws and does not discriminate on the basis of Age, Ancestry, Color, Citizenship, Gender, Gender expression, Gender identity, Genetic information, Marital status, Medical condition, National origin, Physical or mental disability, Pregnancy, Race, Religion, Caste, Sexual orientation, or Veteran status.

Top Skills

AI Infrastructure
Go
GPU Workloads
Kubernetes
NVIDIA CUDA
Python
Speech-To-Text
Terraform
Text-To-Speech
WebRTC

The Company
New York, NY
197 Employees
Year Founded: 2015

What We Do

SWORD Health is the world’s fastest-growing virtual musculoskeletal (MSK) care provider, on a mission to free two billion people from chronic and post-surgical pain. The company’s clinical-grade virtual therapy platform pairs expert physical therapists with FDA-listed wearable technology to deliver a personalized treatment plan that is more effective, easier, and less expensive than traditional physical therapy. SWORD Health believes in the power of people to recover at home, without resorting to imaging, surgeries, or opioids. Since launching in 2015, SWORD Health has worked with insurers, health systems, and employers in the U.S., Europe, and Australia to make quality physical therapy more accessible to everyone.
