Infrastructure Architect

Gurugram, Haryana
In-Office
Mid level
Software
The Role
The Infrastructure Architect designs and manages multi-cloud infrastructure, ensuring security, availability, and cost efficiency, and supports operational stability and disaster recovery.

Role Overview

As an Infrastructure Architect, you will be responsible for designing, implementing, and operating a resilient, secure, and cost-efficient multi-cloud infrastructure across Google Cloud Platform (GCP) and Tencent Cloud.

You will work closely with engineering, DevOps, and SRE teams to define infrastructure standards, ensure high availability through disaster recovery planning, enable unified monitoring, and support production stability. You will also provide Level 3 support for complex infrastructure issues and continuously improve platform reliability and cost efficiency.

Key Responsibilities

1. Infrastructure Architecture & Standards

  • Design and maintain standardized infrastructure blueprints across GCP (global) and Tencent Cloud (APAC/China).
  • Implement secure cross-cloud connectivity, including Tencent CCN and GCP Interconnect, ensuring compliant data flow across regions.
  • Define and enforce IAM/CAM standards, network security, and data residency controls in line with GDPR and China MLPS 2.0 requirements.
  • Implement and maintain Infrastructure as Code (IaC) using Terraform to ensure consistency, repeatability, and auditability.
  • Review infrastructure designs and guide development teams to follow approved patterns.

2. Production Support & Troubleshooting

  • Provide Level 3 escalation support for critical infrastructure and cloud-related production issues.
  • Participate in major incident response, supporting recovery efforts across multiple regions and cloud providers.
  • Perform Root Cause Analysis (RCA) for infrastructure incidents and implement corrective architectural improvements.
  • Collaborate with SRE and DevOps teams to improve system stability and operational maturity.

3. Monitoring & Observability

  • Design and support a centralized monitoring and observability setup across GCP and Tencent Cloud.
  • Implement consistent metrics, logs, and traces using tools such as Prometheus, Grafana, Datadog, or ELK.
  • Enable OpenTelemetry for unified tracing and logging across regions.
  • Configure alerts and health checks to proactively detect infrastructure degradation.

4. Disaster Recovery & Business Continuity

  • Design and maintain DR architectures (Active-Active or Active-Passive) across regions and cloud providers.
  • Implement backup, replication, and data recovery mechanisms, including cross-cloud storage strategies.
  • Define and track RTO and RPO targets for critical systems.
  • Participate in and support DR drills and failover testing to ensure readiness.

5. Cost Optimization & Cloud Efficiency

  • Support cost optimization initiatives, including usage analysis and rightsizing of cloud resources.
  • Assist in implementing Committed Use Discounts (GCP) and prepaid or spot (bidding) instance models (Tencent Cloud).
  • Identify opportunities to reduce data egress and inter-cloud transfer costs.
  • Build visibility into infrastructure costs and work with teams to optimize spend without impacting reliability.

Requirements

Required Skills & Experience

  • 4-5 years of experience in cloud infrastructure, DevOps, or platform engineering roles.
  • Strong hands-on experience with GCP and working knowledge of Tencent Cloud.
  • Experience designing multi-region, highly available cloud architectures.
  • Solid expertise in Terraform and Infrastructure as Code.
  • Practical experience in incident management, troubleshooting, and RCA.
  • Understanding of cloud security, networking, and compliance requirements.
  • Experience with monitoring and observability tools.

What Success Looks Like

  • Stable, repeatable infrastructure deployments across GCP and Tencent Cloud.
  • Faster resolution of production incidents with reduced recurrence.
  • Tested and reliable disaster recovery mechanisms.
  • Improved cost visibility and optimized cloud usage.
  • Engineering teams enabled by clear infrastructure standards and patterns.

Top Skills

Datadog
ELK
Google Cloud Platform
Grafana
OpenTelemetry
Prometheus
Tencent Cloud
Terraform

The Company
HQ: Oakland, CA
74 Employees
Year Founded: 2015

What We Do

Teleport allows engineers and security professionals to unify access for SSH servers, Kubernetes clusters, web applications, and databases across all environments.
