Site Reliability Engineer - Embedded

Sorry, this job was removed at 06:58 p.m. (CST) on Friday, Mar 28, 2025
Bangalore, Bengaluru, Karnataka
Security • Software • Cybersecurity
The Role

Who we are

We're a leading global security authority that's disrupting our own category. Our encryption is trusted by the major ecommerce brands, the world's largest companies, the major cloud providers, the financial systems of entire countries, entire internets of things, and even little things like surgically embedded pacemakers. We help companies put trust - an abstract idea - to work. That's digital trust for the real world.


Job Summary

The Site Reliability Engineer (SRE) collaborates with development teams to embed reliability, scalability, and performance best practices throughout the software development lifecycle. This role bridges software engineering and cloud operations, ensuring mission-critical systems remain highly available and resilient. By integrating reliability early, the SRE fosters a culture of shared responsibility while enabling rapid and safe feature delivery.


What you will do

  • Design and build fault-tolerant, high-performing systems that meet Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
  • Implement monitoring, alerting, distributed tracing, and logging to ensure real-time system health visibility and proactive issue resolution.
  • Act as a first responder for production incidents, conduct blameless postmortems, and drive root cause analysis (RCA) and corrective actions.
  • Develop self-healing systems, automated deployments, and scaling solutions to minimize toil and improve system efficiency.
  • Improve continuous integration and deployment pipelines to enable safe, rapid, and reliable feature rollouts.
  • Review code, debug issues, and perform quality assurance (QA) on software components to enhance system reliability and performance.
  • Work closely with development teams to ensure best practices in software architecture, coding standards, and operational readiness.
  • Forecast scalability needs and optimize cloud infrastructure costs while balancing performance and efficiency.
  • Ensure production environments meet security and compliance requirements, collaborating with teams to mitigate vulnerabilities and enforce best practices.
  • Work closely with development teams to embed reliability at every stage rather than treating it as an afterthought.
  • Use error budgets to balance feature velocity with system stability.
  • Implement observability and automation-first principles to measure system health and drive continuous improvement.
  • Leverage game days, chaos engineering, and resilience testing to validate system robustness and refine operational processes.


What you will have

  • 3-5 years of extensive experience in distributed systems, cloud-native architectures (AWS, GCP, Azure), and DevOps practices.
  • Proficiency in Kubernetes, Terraform, CI/CD pipelines, and Infrastructure as Code (IaC).
  • Strong scripting and automation skills in Python, Go, Bash, or similar languages.
  • Expertise in observability tools such as Prometheus, Grafana, Datadog, Splunk, New Relic, and OpenTelemetry.
  • Ability to troubleshoot complex production issues and drive scalable, resilient solutions.
  • Experience reviewing code, debugging applications, and conducting software testing to ensure high reliability and quality.


Benefits

  • Generous time off policies
  • Top shelf benefits
  • Education, wellness and lifestyle support





The Company
HQ: Lehi, Utah
1,372 Employees
On-site Workplace
Year Founded: 2003

What We Do

DigiCert is the digital trust provider of choice for leading companies around the globe, enabling individuals, businesses, governments, and consortia to engage online with confidence, knowing their digital footprint is secure.
