Staff Platform Engineer, AI/ML Infrastructure

Posted 6 Hours Ago
9 Locations
Hybrid
65K-109K Annually
Senior level
Artificial Intelligence • Healthtech • Machine Learning • Natural Language Processing • Biotech • Pharmaceutical
We’re in relentless pursuit of breakthroughs that change patients’ lives.
The Role
Lead technical strategy and build scalable AI/ML platform infrastructure on AWS and Kubernetes. Design IaC, CI/CD, observability, security, and cost/capacity practices. Mentor engineers, improve deployment reliability, and enable generative AI workloads, model routing, and operational metrics across multiple environments and regions.
Summary Generated by Built In
Staff Platform Engineer, AI/ML Infrastructure
Department: AI Software & Operations
Role Summary
The Staff Platform Engineer, AI/ML Infrastructure will provide technical leadership for the cloud platforms, deployment systems, and operational foundations that power enterprise-scale generative AI applications.
This role will define and evolve the infrastructure architecture for AI/ML platforms running across AWS, Kubernetes, serverless, and containerized environments. The engineer will lead platform standards for reliability, scalability, observability, CI/CD, security, and developer enablement, while partnering closely with software engineering, AI engineering, security, and operations teams.
The ideal candidate combines deep hands-on cloud engineering experience with staff-level technical influence. They are comfortable designing infrastructure patterns, writing infrastructure-as-code, improving delivery pipelines, mentoring engineers, and making architectural decisions that raise the operational maturity of AI platforms across multiple teams.
Key Responsibilities
Define and drive the technical strategy for AI/ML platform infrastructure supporting generative AI applications, LLM integrations, model routing, and enterprise AI services.
Architect, build, and operate scalable cloud platforms using AWS services such as EKS, ECS Fargate, Lambda, DynamoDB, S3, OpenSearch, Secrets Manager, CloudWatch, ALB, and MWAA.
Establish reusable infrastructure patterns using CloudFormation, Helm, and Terraform to support reliable multi-environment and multi-region deployments.
Lead CI/CD architecture using GitHub Actions, reusable workflows, OIDC-based AWS authentication, automated quality gates, deployment promotion, and environment approvals.
Design and improve observability across AI platforms, including CloudWatch dashboards, logs, alarms, Prometheus/Grafana, OpenSearch, Langfuse, and LLM-specific operational metrics.
Build platform capabilities for GenAI workloads, including model availability monitoring.
Partner with software engineering teams to improve deployment reliability, rollback strategies, health checks, autoscaling, load testing, and runtime performance.
Define and enforce security and compliance practices for infrastructure, including IAM permission boundaries, Secrets Manager usage, secret scanning, audit logging, tagging standards, and change-management controls.
Provide technical leadership for cost optimization, capacity planning, environment standardization, and operational resilience across development, test, production, and sandbox environments.
Mentor engineers, review architecture and infrastructure designs, and influence platform engineering practices across teams.
Basic Qualifications
Bachelor's degree in Computer Science, Engineering, Information Technology, or a related technical field, or equivalent practical experience.
7+ years of experience in DevOps, platform engineering, cloud infrastructure, site reliability engineering, or software engineering roles.
Strong hands-on experience with AWS/Azure/GCP infrastructure and services, including container, serverless, networking, storage, observability, and security services.
Experience designing and operating production systems on Kubernetes, ECS/Fargate, or comparable container orchestration platforms.
Proficiency with infrastructure-as-code, especially CloudFormation, Terraform, Helm, or similar tooling.
Strong CI/CD experience with GitHub Actions or similar platforms, including reusable workflows, automated testing, deployment gates, and cloud authentication.
Experience building and operating observability solutions using CloudWatch, Prometheus/Grafana, OpenSearch, or similar tools.
Strong understanding of cloud security practices, IAM, secrets management, least-privilege access, audit logging, and compliance requirements.
Experience supporting distributed systems, microservices, APIs, asynchronous workloads, and multi-environment deployments.
Demonstrated ability to lead technical design, mentor engineers, and influence engineering practices across teams.
Preferred Qualifications
Experience supporting AI/ML or generative AI platforms, including LLM gateways, model routing, prompt observability, token metering, or model failover.
Experience operating platforms in regulated enterprise environments, ideally healthcare, pharmaceutical, finance, or life sciences.
Experience with multi-account, multi-region AWS architectures and enterprise governance patterns.
Experience with cost optimization, autoscaling strategies, capacity planning, and cloud budget monitoring.
Experience with load testing and performance validation using tools such as Locust or comparable frameworks.
Strong Python or scripting skills for platform automation, operational tooling, and CI/CD extensions.
Ability to communicate complex technical decisions clearly to engineering, security, operations, and leadership audiences.
Technical Environment
This role works across a modern AI platform ecosystem including:
Cloud: AWS EKS, ECS Fargate, Lambda, DynamoDB, S3, OpenSearch, CloudWatch, Secrets Manager, ALB, VPC, IAM
Infrastructure-as-Code: CloudFormation, Helm, Terraform
CI/CD: GitHub Actions, reusable workflows, OIDC federation, environment approvals, automated release promotion
AI/ML Platform: AWS Bedrock, Azure OpenAI, LiteLLM, Langfuse
Observability: CloudWatch dashboards and alarms, Prometheus, Grafana, OpenSearch, Langfuse, custom metrics
Security & Governance: IAM permission boundaries, secret scanning, audit logging, tagging compliance, change-management automation
Engineering Practices: Docker, Python, pre-commit, automated testing, load testing, code quality gates, monorepo service standards
Leadership Expectations
As a J090 Staff-level engineer, this role is expected to operate beyond individual delivery. The engineer will identify systemic platform gaps, define technical direction, create reusable standards, and raise engineering maturity across multiple teams.
Success in this role requires strong judgment, ownership, and communication. The engineer should be able to balance hands-on implementation with architectural leadership, guide teams through ambiguous technical decisions, and build platform capabilities that make AI product teams faster, safer, and more reliable.
Work location assignment: Remote
The annual base salary for this position ranges from €65.250,00 to €108.750,00. This salary range applies to the location France - Rives de Paris. We also offer a range of benefits and programs to meet colleagues' needs. Benefits vary by location and can include health care coverage, retirement savings plans, insurance benefits, an Employee Assistance Program, wellness benefits and more. Additional details about total compensation and benefits will be provided during the hiring process. Pfizer compensation structures and benefit packages are aligned based on the location of hire. Final compensation will be determined based on the successful candidate's relevant skills, experience, and qualifications, in accordance with pay equity principles and applicable employment laws. This role is posted in multiple locations. If you are applying for the role in a secondary job posting location where pay transparency regulations apply, your Talent Advisor will share the local pay information with you during the first interview.
Pfizer is an equal opportunity employer and complies with all applicable equal employment opportunity legislation in each jurisdiction in which it operates.
Equal Opportunity & Employment
We believe that diverse and inclusive teams are essential to a company's success. As an employer, Pfizer is committed to valuing diversity and inclusion in all their forms. This diversity is also reflected in the patients and communities we serve. Together, let us continue to build a culture that encourages, supports, and empowers our employees.
Disability & Inclusion
Our mission is to unleash the potential of our colleagues, and we are proud to be an inclusive employer for people with disabilities, ensuring equal employment opportunities for all candidates. We encourage you to be your best self, knowing that we will make all reasonable adjustments to support your application and your future career. Your journey with Pfizer starts here!
Pfizer endeavors to make www.pfizer.com/careers accessible to all users. If you would like to contact us regarding the accessibility of our website or need assistance completing the application process and/or interviewing, please email [email protected]. This is to be used solely for accommodation requests with respect to the accessibility of our website, online application process and/or interviewing. Requests for any other reason will not be returned.
To better understand the permitted and prohibited uses of artificial intelligence throughout the recruitment process, we invite you to consult our best practices on candidates' use of AI on Pfizer Careers.
Information & Business Tech

Skills Required

  • Bachelor's degree in Computer Science, Engineering, IT or equivalent experience
  • 7+ years in DevOps, platform engineering, cloud infrastructure, SRE, or software engineering
  • Hands-on experience with AWS/Azure/GCP infrastructure and services
  • Experience designing and operating production systems on Kubernetes, ECS/Fargate, or similar
  • Proficiency with infrastructure-as-code (CloudFormation, Terraform, Helm)
  • Strong CI/CD experience with GitHub Actions, reusable workflows, OIDC-based cloud authentication
  • Experience building and operating observability solutions (CloudWatch, Prometheus/Grafana, OpenSearch, Langfuse)
  • Strong understanding of cloud security practices: IAM, secrets management, least-privilege, audit logging
  • Experience supporting distributed systems, microservices, asynchronous workloads, multi-environment deployments
  • Demonstrated ability to lead technical design, mentor engineers, and influence cross-team engineering practices
  • Experience supporting AI/ML or generative AI platforms (LLM gateways, model routing, prompt observability, token metering)
  • Experience operating platforms in regulated enterprise environments (healthcare, pharma, finance, life sciences)
  • Experience with multi-account, multi-region AWS architectures and enterprise governance patterns
  • Experience with cost optimization, autoscaling, capacity planning, and cloud budget monitoring
  • Experience with load testing and performance validation (e.g., Locust)
  • Strong Python or scripting skills for platform automation and tooling
  • Ability to communicate complex technical decisions to engineering, security, operations, and leadership

Pfizer Compensation & Benefits Highlights

  • Healthcare Strength: Health coverage includes comprehensive medical with robust mental‑health networks, plus dental and vision options, and coverage for infertility/family‑building and transgender‑affirming care. Recent U.S. summaries name mental‑health partners and outline multiple plan choices.
  • Retirement Support: The retirement program provides a 401(k) with company match plus an additional employer Retirement Savings Contribution, along with financial‑planning support and company‑paid life and disability insurance. These elements are highlighted as part of the core U.S. package.
  • Parental & Family Support: Parental leave is described as up to 26 weeks in the U.S. when combining paid non‑medical parental leave with medical recovery where applicable, with exact pay and weeks dependent on circumstances and plan elections. Family‑building support includes egg preservation, adoption, and surrogacy coverage.

The Company
HQ: New York, NY
121,990 Employees
Year Founded: 1848

What We Do

Our purpose ensures that patients remain at the center of all we do. We live our purpose by sourcing the best science in the world; partnering with others in the healthcare system to improve access to our medicines; using digital technologies to enhance our drug discovery and development, as well as patient outcomes; and leading the conversation to advocate for pro-innovation/pro-patient policies.

Why Work With Us

We are the inventors, the problem solvers, the big thinkers — those who surmount any hurdle to deliver breakthrough medicines to the people who are counting on them the most.

Pfizer Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

Typical time on-site: Not Specified
HQ: Hudson Yards
Provincia de Buenos Aires
Andover, MA
Athens, GR
Chennai, IN
Collegeville, PA
Cork, IE
Dublin, IE
Durham, NC
Groton, CT
Kildare, IE
Madison, NJ
Madrid, ES
Mumbai, Maharashtra
Rochester, MI
San Diego, CA
Seattle, WA
Heights Union East
Center for Digital Innovation