DevOps Engineer (GCP & Kubernetes)

Reposted 18 Days Ago
6 Locations
Remote
Senior level
Information Technology • Software
The Role
As a Senior Site Reliability Engineer, you will focus on monitoring and observability across on-prem and GCP environments, improve application reliability via SRE principles, and provide L2/L3 support during incidents.
Summary Generated by Built In

We are looking for a hands-on Semi-Senior DevOps Engineer to join a high-impact project supporting a global-scale sports event. This role is ideal for someone who enjoys working close to production systems, troubleshooting complex issues, automating infrastructure, and ensuring platform reliability in mission-critical environments.

You will work closely with engineering teams to build, maintain, and improve cloud-native infrastructure running on Google Cloud Platform (GCP) and Kubernetes. The role requires participation in an on-call rotation, including occasional weekend coverage.


Requirements

Responsibilities
  • Deploy, maintain, and improve cloud infrastructure in Google Cloud Platform (GCP).
  • Operate and support Kubernetes environments, including GKE.
  • Build and maintain Infrastructure as Code using Terraform.
  • Monitor production systems and proactively identify reliability risks.
  • Troubleshoot infrastructure, networking, application, and performance issues.
  • Participate in incident response, root cause analysis, and postmortem activities.
  • Implement and maintain observability solutions, dashboards, and alerting systems.
  • Collaborate with software engineering teams to improve deployment processes and operational excellence.
  • Support highly available and scalable production environments.
  • Contribute to automation initiatives that reduce operational overhead and improve reliability.
Required Qualifications
  • 3+ years of experience in DevOps, Cloud Engineering, Site Reliability Engineering, or similar roles.
  • Hands-on experience with Google Cloud Platform (GCP).
  • Strong understanding of core GCP services, including:
    • Compute Engine
    • Cloud Run
    • App Engine
    • Google Kubernetes Engine (GKE)
  • Production experience managing Kubernetes environments.
  • Experience configuring Kubernetes resources such as Deployments, Services, Ingress, ConfigMaps, Secrets, and Autoscaling.
  • Solid understanding of Kubernetes health checks, including readiness and liveness probes.
  • Experience with Infrastructure as Code using Terraform.
  • Understanding of Terraform state management and multi-environment infrastructure design.
  • Strong Linux administration and troubleshooting skills.
  • Good understanding of networking concepts, including:
    • VPCs
    • Subnets
    • Firewall rules
    • Load balancing
    • Private networking
  • Experience with monitoring, logging, and observability platforms.
  • Experience investigating and resolving production incidents.
  • Understanding of reliability concepts such as SLA, SLO, and SLI.
  • Strong verbal and written English communication skills.
Preferred Qualifications
  • Experience designing highly available and globally distributed applications in GCP.
  • Knowledge of zero-downtime deployment strategies.
  • Experience supporting large-scale production environments.
  • Experience with multi-tenant architectures.
  • Scripting experience using Python, Bash, or similar languages.
  • Experience working in hybrid cloud/on-premise environments.
  • Experience participating in SEV incident management.
  • Familiarity with capacity planning and performance tuning.
Technology Stack
  • Cloud: Google Cloud Platform (GCP)
  • Containers: Kubernetes, GKE
  • Infrastructure as Code: Terraform
  • Monitoring & Observability: Grafana, Prometheus, Logging Platforms
  • Operating Systems: Linux
  • Incident Management: PagerDuty, ServiceNow, Slack (or equivalent tools)
Working Requirements
  • Availability to work within CT business hours.
  • Participation in an on-call rotation that includes coverage for one weekend day when scheduled.
What Success Looks Like
  • Reliable operation of production systems during periods of high traffic and critical business activity.
  • Fast and effective incident response and troubleshooting.
  • Well-automated, maintainable infrastructure managed through Infrastructure as Code.
  • Strong collaboration with development teams to improve reliability, scalability, and operational efficiency.

Benefits

At Devsu, we believe in creating an environment where you can thrive both personally and professionally. By joining our team, you’ll enjoy:

  • A stable, long-term contract with opportunities for career growth
  • A remote-friendly culture that promotes work-life balance
  • Continuous training, mentorship, and learning programs to keep you at the forefront of the industry
  • Free access to AI training resources and state-of-the-art AI tools to elevate your daily work
  • A flexible Paid Time Off (PTO) policy as well as paid holidays
  • Challenging, world-class software projects for clients in the US and LatAm
  • Collaboration with some of the most talented software engineers in Latin America and the US, in a diverse work environment

Join Devsu and discover a workplace that values your growth, supports your well-being, and empowers you to make a global impact.

Skills Required

  • Strong experience as a Site Reliability Engineer or Reliability Engineer
  • Deep hands-on expertise with Grafana
  • Solid experience with monitoring and observability systems
  • Production experience operating Kubernetes environments
  • Experience supporting systems in GCP and on-prem environments
  • Strong Linux systems and troubleshooting skills
  • Ability to work in PST time zone
  • Ability to participate in on-call rotation

The Company
HQ: Orlando, FL
223 Employees
Year Founded: 2010

What We Do

Devsu is a trusted technology partner that delivers world-class software delivery and staff augmentation services to startups, scale-ups, and enterprise companies. With over a decade in the industry, our team of seasoned professionals has the knowledge and expertise to help you build, scale, and launch your next digital product. We take pride in our customer-centric approach and our commitment to delivering high-quality solutions that meet and exceed our clients’ expectations. Our nearshore model allows us to offer cost-effective and flexible services tailored to your needs without sacrificing quality. At Devsu, we believe in the power of technology to transform businesses and improve people’s lives. That’s why we invest heavily in our people, processes, and IP to provide you with the best talent and cutting-edge solutions to help you achieve your goals and stay ahead of the curve. Whether you need to develop a new product from scratch, augment your existing team, or optimize your software development processes, Devsu has the expertise and team to make it happen.
