At Aptiv, we build the software backbone for the next generation of vehicles. These are systems where milliseconds matter and downtime isn't just a metric - it's a safety concern. Our technology reaches millions of vehicles on roads worldwide, and the platforms behind them need to be just as resilient as the products they power.
Your Role
We're looking for a Site Reliability Engineer who goes beyond reacting to incidents - someone focused on preventing them in the first place. You should care deeply about observability, not just as dashboards on a wall, but as a way to truly understand how our platform behaves. If you want to work somewhere that takes reliability seriously, keep reading.
You'll join Aptiv's Global DevOps & Platform Engineering team - a cross-functional group responsible for the infrastructure and workflows that power automotive software delivery at scale.
In your daily work you will:
- Build and improve the observability platform - metrics, logs, traces - so teams have clear, actionable visibility into how their services behave
- Define and track SLOs, SLIs, and SLAs that translate business expectations into concrete engineering targets
- Lead incident response when things go wrong - run blameless post-mortems, find root causes and make sure the same issue doesn't happen again
- Plan capacity ahead of demand - study traffic patterns, forecast growth, and scale infrastructure before it becomes urgent
- Automate what you can - if you're doing it twice, script it; if you're scripting it often, make it a self-service tool
- Mentor other engineers through architecture reviews, knowledge sharing, and helping build a culture of continuous improvement
- Work with development, security, and product teams to keep reliability front and center throughout the software lifecycle
- Partner with DevSecOps to harden infrastructure, manage secrets properly and enforce least-privilege access
Your background
- Deep, hands-on experience running K8s in production - cluster lifecycle, networking (CNI, service mesh), RBAC, resource management, HPA/VPA and debugging pod failures at scale
- Solid skills with EKS, EC2, IAM, VPC, S3, CloudWatch, Route 53, and practical cost optimization
- Building and running monitoring platforms with Prometheus, Grafana, Datadog, ELK/OpenSearch, and ideally OpenTelemetry for distributed tracing
- Real-world Terraform experience (state management, modules, workspaces)
- Track record of leading incident response, running blameless post-mortems and driving measurable reliability gains
- Strong Python and Bash skills with a habit of automating operational workflows end-to-end
- Good grasp of TCP/IP, DNS, load balancing, TLS, firewall rules and zero-trust principles
Why join us?
- You can grow at Aptiv. Aptiv provides an inclusive work environment where all individuals can grow and develop, regardless of gender, ethnicity or beliefs.
- You can have an impact. Safety is a core Aptiv value; we want a safer world for us and our children, one with: Zero fatalities, Zero injuries, Zero accidents.
- You have support. We ensure you have the resources and support you need to take care of your family and your physical and mental health with a competitive health insurance package.
Your Benefits at Aptiv:
- Private health care (Signal Iduna) and life insurance for you and your loved ones
- Well-Being Program that includes regular webinars, workshops, and networking events
- Access to sports groups and Multisport card
- Hybrid work (min. 47 days/yr of remote work, flexible working hours)
- Employee Pension Plan paid by the employer (you get an extra 3.5% on each gross salary)
Apply today, and together let’s change tomorrow!
Privacy Notice - Active Candidates: https://www.aptiv.com/privacy-notice-active-candidates
Aptiv is an equal employment opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, sex, gender identity, sexual orientation, disability status, protected veteran status or any other characteristic protected by law.
Skills Required
- Deep, hands-on experience running Kubernetes in production
- Solid skills with AWS services like EKS, EC2, IAM, VPC, S3
- Experience with monitoring platforms such as Prometheus, Grafana, and Datadog
- Real-world Terraform experience
- Strong Python and Bash skills
- Good grasp of TCP/IP, DNS, and load balancing principles
APTIV Compensation & Benefits Highlights
The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about APTIV and has not been reviewed or approved by APTIV.
- Retirement Support: A 401(k) plan with company contribution and competitive matching is described as a notable component of the total rewards package. Equity participation and performance bonuses are also positioned as part of long-term and variable compensation.
- Healthcare Strength: Core coverage is portrayed as broad, spanning medical, dental, vision, life, and disability insurance. Mental health resources and an Employee Assistance Program are also included as part of wellness support.
- Leave & Time Off Breadth: Paid holidays, paid sick days, and flexible time-off policies are included in the benefits mix. Flexible scheduling and remote-work programs further support time management and personal needs.
APTIV Insights
What We Do
Aptiv is a global technology company that develops safer, greener and more connected solutions enabling the future of mobility. #ItsOurMove