DevOps - Site Reliability Engineer

Posted Yesterday
Atlanta, GA
Mid level
Insurance
The Role
The DevOps - Site Reliability Engineer drives the design, deployment, and maintenance of cloud infrastructure utilizing tools like Kubernetes and Terraform. Key responsibilities include optimizing system performance, implementing CI/CD pipelines, and managing observability using platforms such as DataDog. Collaboration across teams is essential, along with staying updated on industry trends and participating in on-call rotations for system reliability.
Summary Generated by Built In


Job Description

SUMMARY OF JOB PURPOSE 

The DevOps / Site Reliability Engineer (SRE) possesses a strong background in deploying and managing infrastructure using modern DevOps practices, with expertise in Kubernetes, Terraform, and observability and monitoring platforms such as DataDog. The DevOps / SRE works closely with the development and operations teams to ensure the reliability, scalability, and performance of our systems.

PRIMARY JOB RESPONSIBILITIES

  • Designs, deploys, and maintains cloud infrastructure using Kubernetes and Terraform, ensuring scalability, reliability, and performance.
  • Collaborates with development teams to implement CI/CD pipelines and automate deployment processes.
  • Monitors system performance, troubleshoots issues, and implements solutions to optimize performance and ensure uptime.
  • Develops and maintains monitoring and alerting systems using observability tools such as DataDog.
  • Implements and manages microservices architectures, ensuring seamless communication and scalability.
  • Troubleshoots and resolves issues related to infrastructure, deployments, and performance, ensuring high availability and reliability of our systems.
  • Stays updated on emerging technologies and industry trends and incorporates them into our infrastructure and practices where applicable.
  • Participates in on-call rotation to address issues and incidents during weekdays, ensuring system reliability and availability.
  • Collaborates closely with all other members of the team to take shared responsibility for the overall efforts that the team has committed to for each sprint.
  • Establishes and maintains positive working relationships with other members of the organization across departments, divisions, and locations.
  • Maintains the confidentiality of proprietary and sensitive information, exercising sound judgment and discretion in any disclosure of information related to EM and its endeavors.
  • Upholds the values of Engle Martin and Our Foundation.

REQUIRED EDUCATION & EXPERIENCE

  • Bachelor’s degree in computer science, engineering, or a related field, or equivalent work experience
  • At least 3-5 years of experience in a DevOps role required; experience as a Site Reliability Engineer preferred
  • Prior experience with cloud platforms such as AWS, Azure, or Google Cloud Platform (GCP)
  • Prior experience with observability and monitoring platforms such as DataDog, Dynatrace or Splunk
  • Certification in relevant cloud technologies preferred (e.g., AWS Certified DevOps Engineer, Certified Kubernetes Administrator)
  • Prior experience with Azure AKS preferred
  • Experience with other DevOps tools and technologies such as Azure DevOps, Jenkins, GitLab CI/CD, etc. preferred

DESIRED KNOWLEDGE, SKILLS & ABILITIES

  • Strong proficiency in Kubernetes and Terraform for managing and deploying infrastructure
  • Solid understanding of microservices architecture and experience in deploying and managing microservices-based systems
  • Proficiency in scripting languages such as Python, Shell, or Bash for automation tasks
  • Familiarity with Agile methodologies and practices
  • Knowledge of security best practices for cloud environments
  • Excellent problem-solving skills and ability to troubleshoot complex issues in distributed systems
  • Strong communication and collaboration skills, with the ability to work effectively across teams in a fast-paced, agile environment
  • Willingness to participate in an on-call rotation to address issues during weekdays
  • Commitment to professional and personal growth and development

WORKING CONDITIONS 

Work is conducted primarily in an indoor office environment with protection from weather conditions and with exposure to noise typical of an office or administrative setting.

PHYSICAL ACTIVITIES AND REQUIREMENTS

Lifting and carrying up to 20 lbs.; frequent sitting, standing, walking, and bending; occasional kneeling, reaching, and stooping; handling office equipment; periodic driving may be required; visual acuity to prepare, read, and organize detailed hard copy and electronic documents; ability to speak and to hear the spoken word in normal face-to-face, web-based, and telephonic business communications. Willingness to travel in a work capacity, including occasional evening, overnight, and weekend hours. Willingness to accommodate occasional meetings and work activities that may be scheduled after normal daytime business hours.

Top Skills

Bash
Kubernetes
Python
Shell
Terraform
The Company
HQ: Atlanta, GA
604 Employees
On-site Workplace

What We Do

Engle Martin
Real relationships. Real results.

Engle Martin is a leading national independent loss adjusting and claims management provider. The firm provides a comprehensive line of service offerings including commercial property, casualty, inland marine/cargo, heavy equipment, large loss adjusting, subrogation, appraisal/umpire, specialty audits, and TPA/claims management. The company has reaffirmed its partnership with vrs Adjusters, a global organization of loss adjusting companies operating in more than 140 countries, in addition to its latest acquisition of Synergy Adjusting Corporation, which provides delegated authority and claims management services domestically and in the London Market.

For details about industries served and services offered by Engle Martin, visit EngleMartin.com.

Engle Martin is hiring! To view all of our career opportunities, please visit:
https://corpartners.wd5.myworkdayjobs.com/EngleMartin
