Senior Cloud Engineer

Reposted 5 Days Ago
Warren, NJ, USA
In-Office
Senior level
Information Technology • Consulting
The Role
Lead design, implementation, and operations of large-scale AWS HPC platforms. Provide architecture oversight, manage an onsite team and offshore engineers, drive automation (IaC/CI-CD), ensure performance, reliability, and cost optimization, and engage stakeholders across science and IT.
Summary Generated by Built In

The Senior Cloud Engineer is responsible for designing, implementing, and managing large-scale high-performance computing (HPC) platforms on AWS for scientific and research-driven workloads. This role provides architecture oversight, technical leadership, and hands‑on engineering support, while ensuring operational excellence and strategic alignment with enterprise goals.

As the cloud engineer, you will guide a distributed engineering team, collaborate with scientific and IT stakeholders, and introduce emerging technologies that enhance HPC scalability, reliability, and automation. The role requires deep expertise in AWS HPC architecture, strong leadership capabilities, and the ability to operate autonomously in complex environments.


Requirements

AWS HPC Architecture & Engineering

  • Architect scalable HPC solutions using AWS ParallelCluster, AWS Batch, EC2 Spot, Auto Scaling, and other core AWS services.
  • Design computing environments supporting computational chemistry, molecular dynamics, genomics, and high‑throughput scientific workloads.
  • Develop and optimize data storage solutions using S3, EFS, FSx for Lustre, and high-performance data access patterns.

Engineering Leadership & Delivery Management

  • Drive the planning, execution, integration, and operationalization of HPC projects.
  • Ensure delivery quality, cost efficiency, and alignment with enterprise cloud strategy.

Operations & Platform Management

  • Oversee cluster operations, job scheduling (Slurm), compute scaling, patching, and incident response.
  • Implement best practices for observability using CloudWatch, Prometheus, Grafana, and logging frameworks.
  • Ensure high availability, reliability, and performance of HPC workloads and underlying infrastructure.

Automation & CI/CD

  • Build infrastructure using Terraform and CloudFormation with fully automated IaC delivery pipelines.
  • Deploy cloud-native automation for cluster lifecycle management, cost optimization, and environment provisioning.

Innovation & Emerging Technologies

  • Lead PoCs to evaluate new HPC frameworks, containerization strategies (Docker, Singularity), and workflow engines (Nextflow, Cromwell).
  • Recommend architectural enhancements and modernization approaches.

Stakeholder Engagement

  • Communicate architecture, progress, risks, and recommendations to technical and non‑technical stakeholders.
  • Collaborate with scientific computing, research, security, and enterprise architecture teams.
  • Act as a trusted advisor to business partners on HPC and cloud-enabled computing.

Technical Skills & Competencies

Cloud & HPC Expertise

  • Deep experience designing HPC systems on AWS (ParallelCluster, Batch, EC2 Spot, FSx for Lustre).
  • Strong Linux administration and troubleshooting skills.
  • Expertise in parallel computing technologies (MPI, OpenMP), job schedulers (Slurm), and distributed systems.

Automation & DevOps

  • Strong Terraform and CloudFormation experience.
  • Hands-on CI/CD experience (GitHub Actions, GitLab CI, Jenkins).
  • Experience managing multi-account AWS organization structures.

Performance & Optimization

  • Skilled in tuning compute, storage, and network performance for HPC workloads.
  • Strong knowledge of cost optimization strategies for large cluster deployments.

Communication & Collaboration

  • Ability to simplify complex HPC/cloud architectures for broader audiences.
  • Strong cross-functional influence and stakeholder management skills.

Leadership Expectations

  • Able to work across regions, functions, and cultures.
  • Excellent written and verbal communication skills.
  • Demonstrates a mindset of diversity, inclusion, and continuous learning.
  • Inspires collaboration, accountability, and innovation.

Decision-Making & Autonomy

  • Makes high-impact architectural and operational decisions independently.
  • Incorporates diverse stakeholder input to develop robust solutions.
  • Drives rapid and high-quality implementation of technical strategies.
  • Accountable for architecture governance, delivery quality, and risk mitigation.

Interaction & Influence

  • Represents the HPC function in customer meetings, architecture committees, and design reviews.
  • Builds strong partnerships with internal teams, affiliates, and external vendors.
  • Navigates change effectively and supports organizational transformation efforts.

Innovation

  • Challenges legacy designs and introduces new technologies for performance, automation, and cost efficiency.
  • Identifies emerging trends in cloud HPC and applies them to business needs.
  • Continuously seeks opportunities to enhance reliability, scalability, and scientific throughput.

Complexity

  • Operates within a high-complexity global environment with diverse scientific and cloud requirements.
  • Requires deep subject matter expertise and the ability to consider enterprise‑wide impacts.

Education & Qualifications

  • Bachelor’s degree in Computer Science, Computational Science, Engineering, or related field (required)
  • Master’s degree (preferred)
  • Preferred certifications:
    • AWS Solutions Architect – Professional
    • AWS Advanced Networking / Data Engineering Specialty
    • Linux or HPC-specific certifications (optional)

Experience Requirements

  • 3+ years in cloud engineering, HPC operations, or scientific computing
  • 2+ years architecting HPC workloads on AWS
  • Experience with scientific research environments is a plus (chemistry, biology, genomics, materials science)


Benefits

About Zifo:

CURIOSITY DRIVEN, SCIENCE FOCUSED, EMPLOYEE BUILT. Our culture is unlike any other, one where we debate, challenge ourselves, and interact with all alike. We are a curious bunch, characterized by our passion to learn and spirit of teamwork. Zifo is a global R&D solutions provider focused on the industries of Pharma, Biotech, Manufacturing QC, Medical Devices, Specialty Chemicals, and other research-based organizations. Our team’s knowledge of science and expertise in technology help Zifo better serve our customers around the globe, including 18 of the Top 20 Biopharma companies.

We look for science backgrounds: Biotechnology, Pharmaceutical Technology, Biomedical Engineering, Microbiology, etc. We possess scientific and technical knowledge and bear professional and personal goals. While we have a “no doors” policy to promote free access within, we do have a tough door to walk in. We search with a two-point agenda: technical competency and cultural adaptability.

We offer a competitive compensation package including accrued vacation, medical, dental, vision, 401k with company matching, life insurance, and flexible spending accounts.

If you share these sentiments and are prepared for the atypical, then Zifo is your calling!

Zifo is an equal opportunity employer, and we value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Skills Required

  • Bachelor's degree in Computer Science, Computational Science, Engineering, or related field
  • 8-10+ years in cloud engineering, HPC operations, or scientific computing
  • 5+ years architecting HPC workloads on AWS
  • 3+ years leading distributed teams
  • Design and implement AWS HPC architectures using ParallelCluster, Batch, EC2 Spot, Auto Scaling
  • Design high-performance data/storage solutions using S3, EFS, FSx for Lustre
  • Operate and manage job scheduling and cluster operations using Slurm
  • Strong Linux administration and troubleshooting skills
  • Expertise in parallel computing technologies (MPI, OpenMP)
  • Infrastructure as Code experience with Terraform and CloudFormation
  • Hands-on CI/CD experience (GitHub Actions, GitLab CI, Jenkins)
  • Observability and monitoring using CloudWatch, Prometheus, Grafana
  • Experience managing multi-account AWS organization structures
  • Experience with containerization for HPC (Docker, Singularity) and workflow engines (Nextflow, Cromwell)
  • Proven ability to lead onsite/offshore teams, mentor engineers, and guide architecture decisions
  • Skilled in tuning compute, storage, and network performance and cost optimization for large clusters
  • Master's Degree
  • Preferred certifications: AWS Solutions Architect - Professional; AWS Advanced Networking / Data Engineering Specialty; Linux/HPC certifications
  • Experience with scientific research environments (chemistry, biology, genomics, materials)

The Company
HQ: Chicago, IL
1,260 Employees
Year Founded: 2008

What We Do

Did you know that the saying ‘Curiosity killed the cat’ actually ends with ‘but satisfaction brought it back’? We suspect that the shortened version is the doing of grown-ups suppressing the innately curious child inside them! At Zifo, we take curiosity as unbounded; it is a celebrated value. Curiosity does wonderful things: it drives exploration, accelerates learning, inspires better solutions, leads to innovation, and drives conversations. We believe that the scientific community’s most important duty to the world is to stay curious. And this is what we do at Zifo. We have a global presence of over 1000 employees spread across our offices in the US, UK, France, Germany, Switzerland, Japan, China, Canada, Singapore & India. We offer services in areas such as Scientific Informatics, Lab Informatics, Clinical Biometrics, Regulatory Compliance including Computer System Validation and Genome Informatics across 20+ countries. Our customers include 7 of the Top 10 global Bio-pharma companies. We have been listed as 'Technology Fast 50' by Deloitte for 9 consecutive years (2012-2020) and as one of the 'Best Places to Work' for the past five years. We strive to stay curious, day in and day out, asking the right questions and listening to provide the right solutions. Write to us at [email protected].
