Senior Engineer – Observability Engineering

Posted 4 Days Ago
3 Locations
In-Office or Remote
Senior level
eCommerce • Fashion • Retail
The Role
Design, build, and maintain end-to-end observability across metrics, logs, traces, synthetic monitoring, and real user monitoring (RUM). Lead New Relic and AIOps adoption, integrate observability into CI/CD and GitOps workflows, mentor teams, evaluate tools, and drive operational excellence and governance.
Summary Generated by Built In

Job Location: Latin America

 

Calling all originals: At Levi Strauss & Co., you can be yourself — and be part of something bigger. We’re a company of people who like to forge our own path and leave the world better than we found it. Who believe that what makes us different makes us stronger. So add your voice. Make an impact. Find your fit — and your future. 

You will be part of the Observability Engineering Team within the Levi’s Shared Platform & Services Technology Organization—a group committed to delivering end-to-end visibility, actionable insights, and operational excellence across our digital ecosystem. The team’s mission is to empower engineering squads with the data, dashboards, and analytics they need to detect, diagnose, and resolve issues faster—ultimately improving reliability, performance, and customer experience. 

 

This role offers a unique opportunity for individuals who are passionate about Observability, Site Reliability, and Data-Driven Engineering to shape how Levi’s measures and manages its digital systems. You will play a key role in the adoption of AI within our Observability capabilities and in advancing Levi’s Observability Strategy, driving proactive monitoring, intelligent alerting, and continuous improvement across our platforms and services.

This role demands strong expertise in applying AI-powered observability and AIOps solutions to enable proactive monitoring, intelligent alerting, automated incident analysis, and self-healing capabilities, driving faster issue resolution and improved engineering efficiency. 

 

As a member of the Observability Engineering Team, you will join a growing community focused on building a culture of visibility, accountability, and performance excellence, enabling Levi’s engineering squads to move with confidence and speed in our ongoing digital transformation journey. 

 

About the Job 

  • Design, build, and maintain end-to-end observability solutions covering metrics, logs, traces, synthetic monitoring, real user monitoring (RUM), and business telemetry using New Relic and modern cloud-native monitoring services. 

  • Develop and implement observability instrumentation standards for microservices, APIs, front-end, and legacy systems, ensuring consistent visibility across hybrid-cloud environments.

  • Leverage AI-driven observability capabilities within New Relic and modern AIOps platforms to enable intelligent anomaly detection, predictive alerting, automated root cause analysis, noise reduction, and self-healing workflows, improving incident response efficiency and accelerating developer productivity.

  • Integrate observability into CI/CD and GitOps workflows, enabling automated monitoring, alerting, and feedback loops throughout the software delivery pipeline. 

  • Lead cross-functional collaboration with platform, Developer Velocity, and application teams to establish scalable monitoring architectures and operational excellence practices.

  • Evaluate, select, and implement next-generation observability tools and frameworks aligned with enterprise architecture, security, and scalability goals. 

  • Mentor and guide engineering teams on observability best practices, fostering a data-driven, performance-oriented culture across the technology organization. 

  • Establish and advance observability governance and maturity models, ensuring compliance with SLAs. 

  • Partner with technology and business leadership to translate observability insights into actionable improvements driving product quality, developer velocity, and operational efficiency. 

 

About You 

  • 7+ years of total IT industry experience with a strong focus on Observability and monitoring solutions.   

  • 5+ years of solid hands-on experience administering, managing, and integrating New Relic solutions in GCP, Azure, and AWS.

  • Experience managing observability solutions such as Datadog, Dynatrace, Grafana, and Prometheus.

  • Deep understanding of the “Four pillars” of Observability—Metrics, Events, Logs, and Traces—and how they interconnect to drive reliability and performance insights. 

  • Strong background in distributed systems, microservices, and containerized applications (Kubernetes, EKS, Docker, service mesh architectures). 

  • Knowledge of automation, CI/CD, and DevSecOps practices in integrating observability into modern delivery pipelines. 

  • Excellent communication, collaboration, and leadership skills, with the ability to translate business observability needs into technical solutions.

  • Strong troubleshooting, analytical and problem-solving abilities. 

  • Understanding of SDLC, Agile Practices & ITIL Frameworks. 

  • Technical certifications highly preferred: New Relic Certified Reliability Engineer, New Relic Verified Foundation, New Relic Certified Performance Engineer.

  • Required: Bachelor’s degree in computer science or equivalent; master’s degree preferred. 

Location: Mexico, D.F., Mexico
Full Time/Part Time: Full time
Current LS&Co Employees, apply via your Workday account.

Skills Required

  • 7+ years of total IT industry experience with a strong focus on observability and monitoring solutions.
  • 5+ years of hands-on experience administering, managing, and integrating New Relic in GCP, Azure, and AWS.
  • Experience managing observability solutions such as Datadog, Dynatrace, Grafana, and Prometheus.
  • Deep understanding of Metrics, Events, Logs, and Traces (the four pillars of observability).
  • Strong background in distributed systems, microservices, and containerized applications (Kubernetes, EKS, Docker, service mesh).
  • Knowledge of automation, CI/CD, GitOps, and DevSecOps practices for integrating observability into delivery pipelines.
  • Experience applying AI-driven observability / AIOps for anomaly detection, predictive alerting, automated root cause analysis, and self-healing.
  • Experience evaluating, selecting, and implementing observability tools aligned with enterprise architecture, security, and scalability.
  • Excellent communication, collaboration, and leadership skills to partner with platform and application teams.
  • Strong troubleshooting, analytical, and problem-solving abilities.
  • Understanding of SDLC, Agile practices, and ITIL frameworks.
  • Bachelor's degree in Computer Science or equivalent (required).
  • Master's degree (preferred).
  • New Relic certifications (Certified Reliability Engineer, Verified Foundation, Certified Performance Engineer) highly preferred.

The Company
Broadmead
0 Employees

What We Do

We’re a company of people who like to forge our own path. We invented the blue jean in 1873, and we reinvented khaki pants in 1986. We pioneered labor and environmental guidelines in manufacturing. And we work to build sustainability into everything we do. We just might be the original startup.
