Site Reliability Engineer III

Posted 7 Days Ago
Hyderabad, Telangana, IND
In-Office
Mid level
Cloud • Information Technology • Security • Software
The Role
The Site Reliability Engineer will design, deploy, and support Kubernetes environments, improve observability, automate processes, and ensure system reliability.
Summary Generated by Built In

At F5, we strive to bring a better digital world to life. Our teams empower organizations across the globe to create, secure, and run applications that enhance how we experience our evolving digital world. We are passionate about cybersecurity, from protecting consumers from fraud to enabling companies to focus on innovation.

Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive.

Site Reliability Engineer – UDF

Position Summary

This role joins our Unified Demo Framework (UDF) platform team to support the launch and management of the F5 Guardrails and Redteam product lines in UDF. The role will focus on designing, deploying, and supporting Kubernetes environments that serve a wide variety of use cases across many F5 teams. As a technical expert, the SRE will work closely with cross-functional teams to instantiate AI features, optimize system performance, and ensure reliability in production environments.

The ideal candidate will have deep expertise in Kubernetes orchestration and containerized architectures, and will build and run systems with an operational excellence mindset. This individual will play a critical role in advancing the operational maturity and scalability of the UDF platform and in ensuring our ability to incorporate new F5 product lines and features.

Key Responsibilities

Kubernetes Orchestration and Management

  • Design, deploy, and manage Kubernetes clusters, ensuring efficient container orchestration to support AI workloads.
  • Implement and maintain Kubernetes-based deployment pipelines.
  • Optimize resource allocation within Kubernetes clusters to reduce costs and maximize performance.
  • Develop and maintain high-availability and fault-tolerant Kubernetes architectures to ensure service continuity.

Observability and Monitoring

  • Design and implement observability pipelines for real-time monitoring of Kubernetes clusters, including metrics collection for scaling, resource utilization, and system health.
  • Leverage tools such as CloudWatch, Datadog, Grafana, or similar platforms to ensure visibility into Kubernetes-managed workloads.
  • Establish logging, tracing, and alerting strategies to enable proactive identification and resolution of performance or reliability issues.

Automation and Scalability

  • Automate infrastructure management tasks to support the efficient deployment and operation of AI functionalities, including upgrades, scaling, and provisioning.
  • Support Infrastructure-as-Code (IaC) methodologies for the provisioning and configuration of environments, leveraging tools such as Terraform or Helm.
  • Contribute to the development of CI/CD workflows tailored for automatic scaling and effective change management practices.

Collaboration and Process Improvement

  • Collaborate with product teams and sales engineering to integrate F5 products into the UDF platform and ensure effective utilization by the sales organization.
  • Support root cause analysis (RCA) processes for issues affecting the UDF platform, driving long-term corrective actions to improve system reliability.
  • Provide technical expertise to design operational workflows and procedures that improve the agility and stability of the UDF platform.

Required Qualifications

  • Education: Bachelor’s degree in Computer Science, Software Engineering, or a related technical field (or equivalent experience).
  • Experience:
    • 4+ years of experience in Site Reliability Engineering (SRE), DevOps, or similar roles with a focus on container management and AWS usage.
    • Strong expertise in managing Kubernetes clusters and containerized workloads in production environments.
    • Hands-on experience deploying and managing Kubernetes environments in AWS, especially using EKS, as well as in self-hosted ecosystems such as on-premises data centers.
    • Proficiency with monitoring and observability tools, including CloudWatch, Grafana, Fluentd, Datadog, or equivalent platforms.
    • Expertise with Infrastructure-as-Code (IaC) tools such as Terraform, Helm, or CloudFormation, and CI/CD frameworks.
    • Solid understanding of networking, storage, and compute infrastructure within containerized environments.
  • Proficiency in coding and scripting languages, including Python, Go, or Bash, with focus on automation and system integration.
  • Expertise in applying security best practices to Kubernetes environments, including data protection and resource access controls.
  • Familiarity with GPU-based workloads in Kubernetes environments and optimization strategies for AI-based workloads.
  • Experience orchestrating, troubleshooting, and optimizing complex network environments in AWS and GCP VPCs, following best practices.
  • Experience working with hypervisors in GCP VPCs.

Preferred Qualifications

  • Certifications:
    • Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD).
    • Relevant cloud certifications, such as AWS Certified Solutions Architect or GCP Cloud Architect.
  • Familiarity with advanced Kubernetes tools and techniques such as service mesh technologies (Istio, Linkerd) or Kubernetes operators for machine learning workflows.
  • Knowledge of distributed computing concepts and experience supporting large-scale AI workloads.
  • Practical experience integrating observability and monitoring into pipelines for inference engines and machine learning models.

The Job Description is intended to be a general representation of the responsibilities and requirements of the job. However, the description may not be all-inclusive, and responsibilities and requirements are subject to change.

Please note that F5 only contacts candidates through an F5 email address (ending with @f5.com) or an automated email notification from Workday (ending with f5.com or @myworkday.com).

Equal Employment Opportunity

It is the policy of F5 to provide equal employment opportunities to all employees and employment applicants without regard to unlawful considerations of race, religion, color, national origin, sex, sexual orientation, gender identity or expression, age, sensory, physical, or mental disability, marital status, veteran or military status, genetic information, or any other classification protected by applicable local, state, or federal laws. This policy applies to all aspects of employment, including, but not limited to, hiring, job assignment, compensation, promotion, benefits, training, discipline, and termination.

F5 offers a variety of reasonable accommodations for candidates. Requesting an accommodation is completely voluntary. F5 will assess the need for accommodations in the application process separately from those that may be needed to perform the job. Request an accommodation by contacting [email protected].

F5 Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about F5 and has not been reviewed or approved by F5.

  • Equity Value & Accessibility: Equity grants and an employee stock purchase plan are positioned as meaningful parts of total compensation, with RSUs and a discounted ESPP commonly included. Pay packages for many technical roles are considered competitive when equity is taken into account.
  • Leave & Time Off Breadth: Paid vacation that increases with tenure, sick time, paid holidays, and paid family leave are prominently featured. Additional programs like volunteer time and periodic wellness long weekends are highlighted as part of the time-off ecosystem.
  • Inclusive Benefits Coverage: Health plans include travel support for specific care (such as reproductive and gender‑affirming services) and mental health resources, alongside comprehensive medical, dental, and vision coverage. These elements are presented as part of a broad, inclusive approach to healthcare.


The Company
HQ: Seattle, WA
5,847 Employees

What We Do

F5 application services ensure that applications are always secure and perform the way they should, in any environment and on any device. F5 (NASDAQ: FFIV) powers applications from development through their entire life cycle, across any multi-cloud environment, so our customers (enterprise businesses, service providers, governments, and consumer brands) can deliver differentiated, high-performing, and secure digital experiences.
