Cloud Engineer (Platform & Infrastructure)

Hiring Remotely in Hyderabad, Telangana, IND
In-Office or Remote
Senior level
Artificial Intelligence • Machine Learning • Robotics • Software
The Role
The Cloud Engineer will enhance cloud operations by improving infrastructure, security, and CI/CD pipelines, ensuring scalability and reliability for production workloads.

About Us

Are you ready to build the future of supply chain? At Gather AI, we're not just creating software; we're pioneering a new era of warehouse intelligence. We've developed a groundbreaking, vision-powered platform that uses autonomous drones and existing equipment to capture real-time data, completely digitizing workflows that have historically been manual and error-prone. This means facilities operate smarter, safer, and more efficiently, ultimately redefining "on-time, in full" delivery.

If you're looking for an opportunity to contribute to truly transformative technology and make a significant impact in a vital industry, Gather AI is the place for you. We're leading the charge in the rapidly evolving robotics industry, and we invite you to join us in reshaping the global supply chain, one intelligent warehouse at a time.

About the Team

This role sits within the Backend and Platform Engineering organization. You'll work day-to-day alongside the Fullstack Engineering team, ensuring application services have the cloud infrastructure they need to scale safely and deploy reliably. You'll also partner closely with the ML Systems Engineering (Ops) team, enabling the infrastructure capabilities required for production ML pipelines, model serving, and data workloads. Cross-functionally, you'll collaborate with QA, Release Engineering, and Platform and Security stakeholders to ensure cloud environments support stable testing pipelines, access control, secrets management, and operational governance.

About the Role

We are looking for a Cloud Engineer (Platform & Infrastructure) to help mature our cloud operations into a structured and scalable platform. Our foundational infrastructure is already in place and actively supporting production workloads, but many current practices evolved organically during earlier growth stages. Rather than building from scratch, you'll evolve an existing production environment by introducing stronger operational patterns, improving deployment safety, and ensuring our infrastructure layer reliably supports increasing system scale. This role offers meaningful ownership of the infrastructure backbone supporting a platform that combines real-time application systems with machine learning workloads, and the opportunity to influence how systems are deployed, operated, and scaled as the organization grows.

What You'll Do

  • Review and rationalize current Azure and AWS environments, identifying configuration drift, security gaps, and operational inconsistencies, and establish clear configuration standards across cloud accounts
  • Introduce repeatable Infrastructure-as-Code patterns to ensure cloud resources are provisioned, versioned, and audited through automated workflows
  • Strengthen CI/CD pipelines for infrastructure and application deployment to reduce manual operations and increase release safety across both application services and ML workloads
  • Establish consistent logging, metrics, and alerting practices across infrastructure and container workloads to improve operational visibility
  • Audit and improve cloud security practices including IAM policies, secrets management, network segmentation, and operational access controls
  • Evaluate current infrastructure architecture and introduce patterns that enable workloads to operate portably across both Azure and AWS environments
  • Improve Kubernetes platform reliability by refining autoscaling policies, workload isolation, and cluster lifecycle management
  • Partner with Fullstack and ML teams to reduce infrastructure friction around environments, networking, and resource provisioning

What You'll Need

  • Bachelor's degree in Computer Science, Engineering, or a related field
  • 5+ years of experience operating production cloud infrastructure at scale
  • Deep experience with at least one major cloud provider (Azure or AWS) and working familiarity with the other
  • Hands-on experience with Kubernetes and Docker for running containerized workloads in production environments
  • Proficiency with Terraform or equivalent Infrastructure-as-Code tooling for provisioning and managing cloud infrastructure
  • Experience implementing automated deployment pipelines using tools such as GitHub Actions, GitLab CI, or similar platforms
  • Strong operational mindset with a focus on reliability, automation, and clear technical documentation

Nice to Have

  • Experience with observability tooling such as Prometheus, ELK, OpenTelemetry, or similar logging, metrics, and monitoring systems
  • Familiarity with supporting ML infrastructure workloads, including pipeline orchestration, model deployment, and scalable inference environments
  • Experience working in logistics, robotics-adjacent platforms, or real-time distributed systems
  • Track record of translating application requirements into secure, reliable, and operationally safe infrastructure architecture
  • Exposure to cloud cost visibility and optimization practices
  • Experience introducing infrastructure governance standards including templates, security baselines, and operational documentation


The Company
Pittsburgh, PA
27 Employees
Year Founded: 2019

What We Do

The world's first software-only autonomous inventory management platform for modern warehouses.