Software Engineer - SRE (Python)

Bengaluru, Bengaluru Urban, Karnataka, IND
In-Office
Mid level
Software
The Role
Improve reliability, scalability, performance, and observability of JFrog SaaS. Implement SRE practices (SLOs/SLIs, postmortems), automate operations with Python/Go, support Kubernetes multi-cloud environments, run resilience/chaos tests, build agentic automation PoCs, participate in on-call incident response, and collaborate with engineering to reduce toil and improve disaster recovery.
Summary Generated by Built In

Fast-Frogward Your Career to Years From Now

JFrog is the only end-to-end software supply chain platform that provides complete visibility, security, and control for automating the delivery of trusted releases from code to production. Our platform enables organizations to manage, secure, and automate their software delivery process, fueling innovation without worry. We empower companies to build and release software faster and more securely than ever before.

With over 7,500 customers worldwide, including many Fortune 100 companies, JFrog is at the forefront of global innovation. Join us in shaping the future of software delivery and contributing to solutions that empower some of the world's most influential industries.

Be part of a team where your work takes centre stage, shaping the future of software development. At JFrog, as a Software Engineer - SRE, you’ll solve critical challenges for leaders like Amazon, Google, and Netflix. Every day brings opportunities to innovate and push boundaries in a fast-moving, frogward-thinking culture. It’s more than writing code—it’s driving the technology that powers the world. If you want your work to matter and thrive on nonstop innovation, JFrog is your place.

We’re hiring a Software Engineer - SRE (Python) to help improve the availability, performance, scalability, and operational excellence of our SaaS environments. You’ll work closely with Engineering and Cloud teams to automate operations, strengthen observability, and improve incident response using modern SRE practices (SLOs/SLIs, error budgets, postmortems). This role is hands-on, collaborative, and impact-focused. If you're eager to make a significant impact in a fast-paced, high-growth environment, we encourage you to apply.

As a Software Engineer - SRE at JFrog you will:
  • Develop and maintain Python/Go automation to improve deployment safety, incident response and operational visibility.
  • Improve reliability, scalability, performance, and observability for JFrog SaaS services in partnership with engineering teams.
  • Implement SRE practices: define SLOs/SLIs, run failure analysis, support capacity planning, perform service readiness reviews and drive tech-debt reliability improvements.
  • Support day-to-day operations of our multi-cloud, globally distributed, cloud-native, Kubernetes-based SaaS environments to keep services available, performant, cost-efficient, and scalable.
  • Build and enhance internal services and tools to streamline operations and reduce toil through automation.
  • Run PoCs, prototype, and drive implementations for agentic automation using an ADK/agent framework, leveraging AI where it meaningfully improves operational & strategic excellence.
  • Support resilience testing/chaos experiments (as appropriate) and improve disaster recovery readiness.
  • Participate in on-call, lead incidents to resolution, and drive postmortems and follow-up actions that prevent recurrence.
  • Act as a primary contact for SaaS production issues, collaborating closely with Product and Engineering groups.
  • Evaluate cloud-native technologies and vendor solutions that improve SaaS reliability and lifecycle management.

To be a Software Engineer - SRE at JFrog you need...

  • Experience: 4+ years in large-scale production environments.
  • Cloud & Orchestration: Production experience with Kubernetes (Docker) and at least one cloud provider (AWS, GCP, or Azure).
  • SRE Fundamentals: Working knowledge of SLO/SLI, alerting strategy, incident response, postmortems, and reliability improvements.
  • Development: Proficiency in Python or Go for automation, integrations, and internal tools.
  • Observability: Hands-on with metrics/logs/traces using tools like New Relic, Coralogix, Prometheus, Grafana, OpenTelemetry (or equivalents).
  • Incident & Resilience: Strong incident response and triage using PagerDuty/Opsgenie (or equivalent); exposure to chaos/resilience testing (e.g., Gremlin) and DR readiness.
  • AI/Agentic Ops: Practical use of AI-assisted operations (e.g., log/incident summarization, triage helpers); familiarity building simple agents with an ADK/agent framework (e.g., LangGraph, LangChain, CrewAI, or similar).
  • CI/CD: Working knowledge of microservices delivery using Jenkins, ArgoCD, or equivalent.
  • Soft Skills: Strong documentation (runbooks, postmortems) and a collaborative, independent problem-solving mindset.

NOTE: We are located in Bangalore (Bellandur) and follow a hybrid work model with 3 mandatory in-office days.

The Company
HQ: Sunnyvale, California
1,603 Employees
Year Founded: 2008

What We Do

JFrog Ltd. (Nasdaq: FROG) is on a mission to create a world of software delivered without friction from developer to device. Driven by a “Liquid Software” vision, the JFrog Software Supply Chain Platform is a single system of record that powers organizations to build, manage, and distribute software quickly and securely, ensuring it is available, traceable, and tamper-proof. The integrated security features also help identify, protect against, and remediate threats and vulnerabilities. JFrog’s hybrid, universal, multi-cloud platform is available as both self-hosted and SaaS services across major cloud service providers. Millions of users and 7K+ customers worldwide, including a majority of the FORTUNE 100, depend on JFrog solutions to securely embrace digital transformation. Once you leap forward, you won’t go back!
