Lead Cloud Infrastructure Engineer

Posted 4 Days Ago
Hyderabad, Telangana, IND
Hybrid
Expert/Leader
Artificial Intelligence • Big Data • Cloud • Information Technology • Machine Learning
The Role
The Lead Cloud Infrastructure Engineer will design and manage cloud infrastructure, mentor engineers, and ensure system scalability and security while engaging with clients. Responsibilities include implementing IaC, managing CI/CD pipelines, and leading incident response efforts.
Summary Generated by Built In

We are seeking a highly skilled and experienced Senior Infrastructure Engineer to join our dynamic team. The ideal candidate will be passionate about building and maintaining complex systems, with a holistic approach to architecture. You will play a key role in designing, implementing, and managing cloud infrastructure, ensuring scalability, availability, security, and optimal performance. You will also mentor other engineers and engage with clients to understand their needs and deliver effective solutions.

Responsibilities:

    • Design, architect, and implement scalable, highly available, and secure infrastructure solutions, primarily on Google Cloud Platform (GCP) and/or Amazon Web Services (AWS).

    • Develop and maintain Infrastructure as Code (IaC) using Terraform for enterprise-scale maintainability and repeatability.

    • Utilize Kubernetes deployment tools such as Helm/Kustomize in combination with GitOps tools such as ArgoCD for container orchestration and management.

    • Design and implement CI/CD pipelines using platforms such as GitHub, GitLab, Bitbucket, Cloud Build, and Harness, with a focus on rolling deployments, canaries, and blue/green deployments.

    • Ensure auditability and observability of pipeline states.

    • Implement security best practices, audit, and compliance requirements within the infrastructure.

    • Provide technical mentorship and training to engineering staff.

    • Engage with clients to understand their technical and business requirements, and provide tailored solutions.

    • Troubleshoot and resolve complex infrastructure issues.

    • Lead and participate in incident response, troubleshooting, and root cause analysis for production issues.
      
    • Manage incident lifecycle activities including triage, escalation, communication, and post-incident reviews.

    • Monitor application and infrastructure health using observability platforms and monitoring tools.

    • Define and maintain SLIs, SLOs, and error budgets to improve service reliability.

Qualifications:

    • 9+ years of experience in Infrastructure Engineering or a similar role.

    • Extensive experience with Google Cloud Platform (GCP) and/or Amazon Web Services (AWS).

    • Proven ability to architect for scale, availability, and high-performance workloads.

    • Deep knowledge of Infrastructure as Code (IaC) with Terraform.

    • Strong experience with Kubernetes and related tools (Helm, Kustomize, ArgoCD).

    • Solid understanding of Git, branching models, CI/CD pipelines, and deployment strategies.

    • Experience with security, audit, and compliance best practices.

    • Excellent problem-solving and analytical skills.

    • Strong communication and interpersonal skills, with the ability to engage with both technical and non-technical stakeholders.

    • Experience in technical mentoring, team-forming, and fostering self-organization and ownership.

    • Experience with client relationship management and project planning.

    • Strong experience with incident management processes and major incident handling.

    • Hands-on experience with observability and monitoring.

    • Knowledge of logging, metrics, tracing, and distributed observability concepts.

    • Experience defining and managing SLIs, SLOs, and alerting strategies.

Certifications:

    • Relevant certifications (e.g., Certified Kubernetes Administrator, Google Cloud Certified Professional Cloud Architect).

    • Software development experience (e.g., Terraform, Python).

    • Experience with or exposure to machine learning infrastructure.

Education:

    • Bachelor's degree in Computer Science, a related field, or equivalent experience.


The Company
HQ: Naperville, IL
240 Employees
Year Founded: 2000

What We Do

Egen is a data engineering and cloud modernization firm partnering with leading Chicagoland companies to launch, scale, and modernize industry-changing technologies. We are catalysts for change who create digital breakthroughs at warp speed. Our team of cloud and data engineering experts is trusted by top clients in pursuit of the extraordinary. Our mission is to be an enabler of amazing possibilities for companies looking to use the power of cloud and data. We want to stand shoulder to shoulder with clients, as true technology partners, and make sure they succeed at what they have set out to do. We want to be disruptors, game-changers, and innovators who have played an important part in moving the world forward.
