Responsibilities
- Apply DevOps practices to increase deployment speed, security, and quality.
- Architect and run CI/CD workflows in GitHub Actions (matrix builds, reusable workflows, OIDC federation).
- Design, build, and maintain Terraform/Terragrunt modules for VPCs, subnets, security groups, site-to-site VPNs, and private links.
- Manage container orchestration on ECS Fargate and Kubernetes (AWS & OCI) with Helm and KEDA.
- Implement autoscaling, blue-green / canary releases, and cost-optimization for GPU and CPU workloads.
- Diagnose performance bottlenecks across network, compute, storage, and application layers.
- Maintain high-quality documentation.
Requirements
- B.S. in Computer Engineering, Information Systems, or equivalent experience.
- Strong scripting skills (Python, Bash); Go or Rust a plus.
- Hands-on CI/CD with GitHub Actions and experience running production workloads on:
  - AWS: ECS Fargate, S3, RDS, CloudWatch, VPC networking.
  - Kubernetes: OCI OKE, Helm, Istio, KEDA.
- IaC expertise with Terraform and Terragrunt in multi-account/multi-cloud setups.
- Solid networking foundations: VPC design, subnets, routing, VPN/IPSec tunnels, security groups, load balancers.
- Observability stack experience (Grafana, Prometheus, Loki, Tempo, Datadog).
- Familiarity with container security, SBOMs, image scanning, secret management, and least-privilege IAM.
- Excellent problem-solving skills, ownership mindset, and ability to work autonomously within a distributed team.
What We Do
Tractian is a machine intelligence company that offers industrial monitoring systems. Tractian builds streamlined hardware-software solutions to give maintenance technicians and industrial decision-makers comprehensive oversight of their operations. It is democratizing access to sophisticated real-time monitoring and asset operations tools.
Tractian's solutions are used in environments that address a combined total of 5% of global industrial output. The company’s broad market reach is evidenced in its customer base from various industries, such as John Deere, Procter & Gamble, Caterpillar, Goodyear, Carrier, Johnson Controls, and Bimbo, the owner of the brands Little Bites and Thomas Bagels. Tractian's customers see a 6-12x ROI with savings of $6,000 per monitored machine annually on average.
In a major milestone and a first for the industry, Tractian launched the AI-Assisted Maintenance category in the industrial sector. In this new paradigm, artificial intelligence identifies machine problems and suggests preventive actions, giving maintenance professionals valuable insight and support. Assisted Maintenance is intended to augment maintenance professionals, helping them reach more accurate diagnoses with human-in-the-loop feedback.
Tractian's mission is to elevate this category of workers in a highly impactful way. The Assisted Maintenance category will provide unprecedented support for maintenance professionals. By combining shop floor expertise with our technology, maintainers will be able to anticipate and address issues with greater accuracy and speed.









