ML-focused Site Reliability Engineer - Developer Platforms

Bucharest, București
In-Office
Mid level
Artificial Intelligence • Digital Media • Marketing Tech • Software
Adobe is changing the world through digital experiences.
The Role
The Site Reliability Engineer will enhance system reliability and automate incident responses, utilizing machine learning and DevOps methodologies to optimize performance and quality of service.

Our Company
Changing the world through digital experiences is what Adobe’s all about. We give everyone—from emerging artists to global brands—everything they need to design and deliver exceptional digital experiences! We’re passionate about empowering people to create beautiful and powerful images, videos, and apps, and transform how companies interact with customers across every screen. 
We’re on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity. We realize that new ideas can come from everywhere in the organization, and we know the next big idea could be yours!

The Opportunity

We have a fantastic opportunity for an ML-focused Site Reliability Engineer to join our Developer Platforms team based in Bucharest.

We are looking for an engineer with hands-on experience in machine learning, including designing and training models for real-world applications. The ideal candidate will play a crucial role in developing and implementing anomaly detection systems to proactively spot and address operational issues in intricate infrastructures. This role demands a strong understanding of AI Ops methodologies to optimize performance, automate incident response, and enhance system reliability. Candidates should be enthusiastic about using data to drive intelligent automation and improve service resilience at scale.

The Role:

  • Build outstanding things that matter. You’ll work on a critical growth initiative, solving problems for engineers and customers.

  • Grow. Sharpen your skills, use innovative technology, and collaborate with your peers.

  • Collaborate. Work in an environment that values collaboration.

What you'll do:

  • Ensure the highest level of uptime and Quality of Service (QoS) to Adobe’s customers through operational excellence

  • Architect and build an AI anomaly detection system that works on Adobe’s observability data at scale, partnering with other teams to work across boundaries

  • Define service level objectives (SLOs) and service level indicators (SLIs) to represent and measure service quality

  • Identify areas to improve service resiliency through techniques such as chaos engineering, performance/load testing, and anomaly detection

  • Support and maintain globally distributed multi-cloud (public and/or private) environments

  • Automate common, repeatable tasks at a large scale to reduce toil

  • Tackle performance and stability issues using a wide variety of tools

  • Participate in an on-call rotation as required

  • Determine the root cause of all production-level incidents and write corresponding high-quality RCA reports

What you'll need to succeed:

  • Hands-on experience with AI anomaly detection and training models

  • Expertise in MCP integration; experience with MCP-to-MCP communication is a nice-to-have

  • Understanding of how to fine-tune signals from observability systems so that our AI capabilities can scale to production data

  • Deep understanding of both software engineering and technical operations

  • DevOps skills (Scrum/Kanban/Agile/CI-CD/12-factor)

  • Experience in modern cloud-based SaaS delivery technologies: AWS, Azure, Jenkins, Git, Atlassian Jira and Confluence, Linux, DNS, e-mail, containers, log analysis, monitoring, Java, Apache, Tomcat, Memcached, Qpid, and MySQL on Linux, as well as Prometheus, Grafana, New Relic, and Splunk

  • Expertise with container orchestration engines (Kubernetes)

  • Programming skills, particularly with Python, Java, and Ruby

  • Applied skills in machine learning

  • Excellent communication, interpersonal, and teamwork skills

  • Familiarity with a variety of cloud and automation concepts, practices, and procedures

Adobe is proud to be an Equal Employment Opportunity employer. We do not discriminate based on gender, race or color, ethnicity or national origin, age, disability, religion, sexual orientation, gender identity or expression, veteran status, or any other applicable characteristics protected by law.

Adobe aims to make Adobe.com accessible to any and all users. If you have a disability or special need that requires accommodation to navigate our website or complete the application process, email [email protected] or call (408) 536-3015.

Top Skills

Apache
Atlassian Jira
AWS
Azure
Confluence
Containers
DNS
E-Mail
Git
Grafana
Java
Jenkins
Kubernetes
Linux
Memcached
MySQL
New Relic
Prometheus
Python
Qpid
Ruby
Splunk
Tomcat

The Company
HQ: San Jose, CA
21,000 Employees
Year Founded: 1982

What We Do

When you join Adobe Life in Austin, you’ll immerse yourself in a world of cutting-edge technology, exceptional colleagues, and meaningful work that touches millions of people everywhere.

Adobe is the global leader in digital media and digital marketing solutions. Our creative, marketing and document solutions empower everyone – from emerging artists to global brands – to bring digital creations to life and deliver immersive, compelling experiences to the right person at the right moment for the best results. In short, Adobe is everywhere, and we’re changing the world through digital experiences.

Why Work With Us

Adobe Austin embodies the culture of the Austin neighborhood around it, which is diverse, enterprising, and innovative.
