About the Role
We are a fast-growing semiconductor startup building next-generation silicon. Our design and verification pipelines rely on large-scale Linux compute infrastructure spanning AWS and on-prem environments.
We are seeking a senior, hands-on Cloud & Infrastructure IT Engineer to own the reliability, performance, and automation of our mission-critical EDA platforms. You will work directly with chip design teams to ensure our compute environments are fast, stable, secure, and ready to scale.
What You’ll Do
- Operate and scale hybrid AWS + on-prem Linux compute infrastructure for chip design and verification workloads.
- Own day-to-day reliability, performance tuning, capacity planning, and incident response.
- Build and maintain AWS environments using Terraform and Ansible.
- Automate provisioning of VPCs, IAM, EC2, FSx, EBS, S3, VPNs, and security controls.
- Tune Linux systems for CPU-, memory-, and I/O-intensive EDA workloads.
- Operate and optimize grid / job scheduling platforms such as Slurm, LSF, or Grid Engine.
- Design and manage high-throughput storage solutions for simulation pipelines.
- Develop automation and self-service tooling using Python and Bash.
- Implement observability and alerting using Prometheus and Grafana.
- Participate in the on-call rotation and lead root-cause analysis for production incidents.
Required Qualifications
- AWS: VPC, EC2, IAM, FSx, EBS, S3, VPN, security controls
- Infrastructure as Code: Terraform, Ansible
- Linux / HPC: Kernel, filesystem, and network performance tuning
- Schedulers: Slurm / LSF / Grid Engine
- Automation: Python, Bash
- Observability: Prometheus, Grafana
- CI/CD: GitHub Actions / GitLab CI
Requirements
- 7+ years of hands-on experience operating large-scale Linux infrastructure.
- Strong experience managing AWS production environments.
- Advanced proficiency with Terraform, Ansible, Python, and Bash.
- Deep understanding of networking, storage, and Linux internals.
- Comfortable owning business-critical systems in a fast-moving startup.
- Experience supporting semiconductor / EDA / HPC workloads.
Preferred
- Exposure to Azure or GCP.
- Experience with cloud cost optimization / FinOps.
Retym Compensation & Benefits Highlights
The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Retym and has not been reviewed or approved by Retym.
- Fair & Transparent Compensation: Pay is positioned as a relative bright spot compared with broader company sentiment, with compensation rated more favorably than overall sentiment in the available snapshot.
- Strong & Reliable Incentives: Good incentives are described alongside work-life balance in one public summary, suggesting rewards may be a meaningful part of the package for some employees.
- Wellbeing & Lifestyle Benefits: Meal support is implied by mention of Grubhub among “pros,” indicating at least some lifestyle/perk coverage may exist for certain teams or locations.
Retym Insights
What We Do
We are a semiconductor startup that has brought together a world-class team of ASIC designers, optical communications experts, and seasoned investors who excel at creating disruptive technologies. Together, we are building a novel semiconductor technology that will transform the datacenter and telecommunications industries.








