Graphics Processing Unit (GPU) Engineer - Top Secret/SCI

Bethesda, MD
In-Office
Senior level
Information Technology • Software • Cybersecurity • Automation
The Role
Design and optimize GPU architectures for Linux systems, integrate with operating systems, develop applications using CUDA/OpenCL, and enhance performance.
Summary Generated by Built In

Location: Bethesda, MD

Category: Systems Engineer 

Travel Required: No

Remote Type: Onsite

Clearance: Top Secret/SCI

Sunayu, LLC is looking for a highly skilled Systems Engineer with deep expertise in operating systems, hardware, GPUs, and high-speed networking. In this role, you will design, develop, and optimize GPU clusters that power enterprise AI for mission customers.

This is a 100% on-site position. All work must be performed at the customer site in Bethesda at the Intelligence Community Campus.


Primary Responsibilities 

  • GPU Cluster Engineering: Design, configure, and maintain GPU clusters. Collaborate with a multidisciplinary team to define and optimize architectures, ensuring they meet performance, power efficiency, and feature requirements.
  • Operating System Integration: Work closely with AI/ML engineers to ensure smooth GPU integration with Linux-based systems. Optimize GPU drivers for compatibility, reliability, and performance. Provide regular maintenance and updates.
  • Performance Optimization: Analyze GPU performance, identify bottlenecks, and develop strategies to improve efficiency across hardware and software layers.
  • Tooling and Automation: Build and maintain debugging tools, profiling utilities, and performance analysis software for Linux environments. Leverage scripting and configuration tools such as Bash, Python, Ansible, Puppet, and Salt.
  • Compliance & Documentation: Maintain technical documentation, architectural specifications, and Linux best practices. Support ATO (Authority to Operate) and ensure compliance with federal security standards.

   

Basic Qualifications 

  • Bachelor's or higher degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field with at least 12 years of related technical experience. Additional years of experience may be considered in lieu of a degree.
  • 10+ years of relevant systems engineering experience
  • Experience managing NVIDIA GPU data center platforms (DGX, HGX, H200, H100, L4).
  • Knowledge of enterprise server components (storage/network controllers, HBA, SSDs). 
  • Strong expertise with Linux distributions (RHEL, Ubuntu, Oracle Linux, and Rocky Linux).
  • Excellent problem-solving skills and the ability to collaborate within a team.
  • Candidate must, at a minimum, meet DoD 8140/8570 IAT Level II certification requirements (currently Security+ CE, CCNA-Security, GICSP, GSEC, or SSCP, along with an appropriate computing environment (CE) certification). An IAT Level III certification is also acceptable (CASP+, CCNP Security, CISA, CISSP, GCED, GCIH, CCSP).

Clearance

  • Due to the nature of the government contracts we support, US Citizenship is required.
  • TS/SCI clearance with polygraph required, or TS/SCI clearance and willingness to obtain a polygraph prior to starting.

Preferred Qualifications 

  • Experience with Kubernetes cluster management and AI/ML workflow orchestration (Argo, Airflow, and Kubeflow).
  • Familiarity with GPU virtualization and cloud computing.
  • Experience with Prometheus/Grafana for monitoring.
  • Knowledge of distributed resource scheduling systems such as Slurm (preferred) or LSF.

Top Skills

Ansible
Bash
CUDA
Docker
GPU-Specific Languages
Grafana
Kubernetes
Linux
OpenCL
Prometheus
Puppet
Python
Salt
Slurm
Terraform

The Company
HQ: Annapolis Junction, Maryland
13 Employees
Year Founded: 2014

What We Do

Sunayu is the catalytic agent of engineering change, helping companies both large and small mature and evolve their technological footprint to resolve their most complex and enduring obstacles.

We focus on a DevOps culture to implement and enhance infrastructure as code, with specialties in cybersecurity and big data development. This methodology allows us to provide world-class service to all of our customers. We foster a culture of innovation, hard teamwork, perseverance, balance, and family.

Sunayu delivers creative innovation using bleeding-edge technology. We will re-envision and remodel your infrastructure by focusing on automation, deployment, system engineering, analytics development, configuration management, and information security.
