Site Reliability Engineer (GenAI)

Posted 6 Days Ago
Irving, TX
3-5 Years Experience
AdTech • Marketing Tech
The Role
The Site Reliability Engineer (SRE) ensures the reliability, scalability, and availability of services across cloud and on-prem platforms, focusing on automation, observability, and infrastructure management. Responsibilities include using tools like Ansible and Python for automation, setting up Grafana dashboards, and managing applications on OpenShift, while optimizing resource utilization and collaborating in Agile sprint planning.
Summary Generated by Built In

Company Description

Publicis Sapient is a digital transformation partner helping established organizations get to their future, digitally enabled state, both in the way they work and the way they serve their customers. We help unlock value through a start-up mindset and modern methods, fusing strategy, consulting and customer experience with agile engineering and problem-solving creativity. United by our core values and our purpose of helping people thrive in the brave pursuit of next, our 20,000+ people in 53 offices around the world combine experience across technology, data sciences, consulting, and customer obsession to accelerate our clients’ businesses through designing the products and services their customers truly value.

Job Description

The Site Reliability Engineer (SRE) will be responsible for ensuring the reliability, scalability, and availability of services across cloud and on-prem platforms, with a focus on OpenShift and Grafana. The role combines expertise in automation, observability, and infrastructure management to optimize resource allocation and maintain service uptime. The ideal candidate will have experience working with both cloud (GCP) and on-prem environments, particularly in managing AI/ML and GPU-based workloads.

Responsibilities:

  • Automation & Scripting: Use tools like Ansible and Python to automate provisioning, monitoring, and scaling tasks. Create reusable automation scripts for efficient infrastructure management.
  • Observability & Monitoring: Set up Grafana dashboards and Prometheus alerts to track service health, uptime, and performance metrics across platforms.
  • Infrastructure Management: Deploy and manage applications on OpenShift or other Kubernetes-based platforms, ensuring efficient application lifecycle management.
  • Platform & Service Monitoring: Implement and automate monitoring for both cloud and on-prem environments, ensuring compliance with SLA requirements.
  • Capacity Planning & Resource Management: Monitor and optimize GPU and CPU utilization, ensuring resources are allocated efficiently across workloads.
  • Collaboration & Sprint Planning: Participate in Agile/Scrum sprint planning, collaborating with other teams to ensure tasks are delivered on time and aligned with service-level objectives.
  • Process Automation: Automate manual processes such as resource requests, tenant onboarding, and lifecycle management for AI/ML platforms and other workloads.

Qualifications

  • Strong experience with automation tools like Ansible and Python scripting for infrastructure management.
  • Proficiency in Grafana and Prometheus for monitoring and setting up alerting mechanisms.
  • Hands-on experience managing applications in OpenShift or other Kubernetes-based platforms.
  • Ability to automate service monitoring and infrastructure scaling in both cloud and on-prem environments, ensuring SLA compliance.
  • Experience with infrastructure management for cloud (GCP) and hybrid environments.
  • Experience with infrastructure as code (IaC) tools (Terraform).

Additional Information

• Flexible vacation policy; time is not limited, allocated, or accrued
• 16 paid holidays throughout the year
• Generous parental leave and new parent transition program
• Tuition reimbursement
• Corporate gift matching program

Base Pay Range: USD 75,000 - 146,000 (varies depending on experience) 

The range shown represents a grouping of relevant ranges currently in use at Publicis Sapient. Actual range for this position may differ, depending on location and specific skillset required for the work itself. 

As part of our dedication to an inclusive and diverse workforce, Publicis Sapient is committed to Equal Employment Opportunity without regard for race, color, national origin, ethnicity, gender, protected veteran status, disability, sexual orientation, gender identity, or religion. We are also committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If you need assistance or an accommodation due to a disability, you may contact us at [email protected] or you may call us at +1-617-621-0200. 

Top Skills

Python
The Company
HQ: Paris
45,929 Employees
On-site Workplace
Year Founded: 1926

What We Do

As a platform at the intersection of marketing and digital business transformation, driven through the alchemy of creativity and technology, Publicis Groupe is built on The Power of One. Publicis Groupe offers its clients seamless access to the expertise of its 80,000 talents across four Solution hubs: creative with Publicis Communications (Publicis Worldwide, Saatchi & Saatchi, Leo Burnett, BBH, Marcel, Fallon, MSL, Prodigious), media services with Publicis Media (Starcom, Zenith, Spark Foundry, Blue 449, Performics, Digitas), digital business transformation with Publicis.Sapient and health & wellness communications with Publicis Health. Publicis Groupe’s agencies are present in over 100 countries around the world.
