Infrastructure Software Engineer, Core Platform

Posted Yesterday
2 Locations
In-Office
Mid level
Artificial Intelligence • Machine Learning • Software • Defense
The Role
Own infrastructure reliability, scalability, and developer experience: author IaC (OpenTofu/Terraform), design/manage Kubernetes and Helm charts, architect CI/CD, implement observability (Grafana/Prometheus/OpenTelemetry), manage secrets and production networking, optimize container builds, and maintain developer tooling across environments.
Summary Generated by Built In
About the Role

Lumbra is building Nebula, an agentic harness running as a set of microservices on managed Kubernetes, backed by managed databases, caching, and workflow orchestration, all provisioned with OpenTofu and deployed with Helm via CI/CD. We currently run on GCP but are not wed to any single provider. We're looking for an infrastructure engineer to own the reliability, scalability, and developer experience of the harness across dev, demo, and production environments.

What You'll Own
  • Author and maintain Infrastructure as Code (OpenTofu/Terraform) modules for cloud resources including networking, managed Kubernetes clusters, databases, caching, and container registries. Strong IaC skills and experience with GCP (or equivalent) are essential.

  • Design and manage Kubernetes cluster configurations including node pool autoscaling, workload identity, private connectivity for database access, and network policies. You need deep Kubernetes knowledge, not just manifest authoring.

  • Build and optimize Helm charts for a shared service template consumed by multiple services, managing environment-specific overrides across dev, demo, staging, and production. Experience with Helm inheritance patterns and chart libraries is important.

  • Own the CI/CD pipeline architecture: multi-stage builds, conditional triggers based on file-change detection, and deployment orchestration. You should be comfortable authoring and debugging complex pipeline configurations.

  • Implement and maintain the observability stack (metrics, traces, logs) across all services using Grafana, Prometheus, and OpenTelemetry. Experience instrumenting distributed systems and building actionable dashboards is needed.

  • Manage secrets lifecycle and credential rotation with automated syncing to Kubernetes, plus identity provider configuration. Understanding of zero-trust patterns and secrets management at scale is essential.

  • Configure and maintain production networking including load balancing, TLS termination, DNS, and authentication proxies. Solid networking fundamentals are a must.

  • Optimize the container build pipeline for speed and security: multi-stage builds, layer caching, image hardening, and size reduction for faster, safer deployments.

  • Continuously profile and optimize platform performance: query latency, pod startup times, resource utilization, and network throughput. You care about measurable improvements and treat sluggish infrastructure as a bug, not a tradeoff.

  • Maintain developer experience tooling including local development environments, task automation, and environment bootstrapping that lets engineers go from clone to running system quickly.

Preferred Qualifications
  • Experience operating Temporal or similar workflow orchestration systems in production

  • Familiarity with graph databases (Neo4j) and object storage (MinIO, S3-compatible) on Kubernetes

  • Experience with Keycloak or similar identity providers: administration, realm configuration, and OIDC management

  • Background in cloud cost optimization: committed use discounts, node pool right-sizing, spot instances

  • Familiarity with GitOps patterns (ArgoCD, Flux) as an evolution from push-based CI/CD

  • Understanding of public key infrastructure: certificate management, mTLS, CA hierarchies, and trust chain validation

  • Experience with hybrid networking between cloud and on-prem environments

Benefits
  • Comprehensive medical, dental, and vision plans

  • Premiums 100% covered by Lumbra for all employees

  • Exceptionally low premiums for spouses and dependents

  • Basic life insurance and disability 100% covered for all employees by Lumbra

  • Option to purchase additional life insurance available

  • "Take the time off that you need, when you need it" paid time off, not accrual-based

  • Generous company holiday calendar including a holiday shutdown in December

  • Supportive leave of absence program including time off for military service, medical events, and parental leave

  • Full 401(k) retirement plan for all full-time eligible employees

  • Company-funded commuter benefits

  • Free access to on-site gym at office

Skills Required

  • Infrastructure as Code (OpenTofu/Terraform) module authoring for cloud resources
  • Experience with GCP or equivalent cloud provider
  • Deep Kubernetes knowledge including cluster config, node pool autoscaling, workload identity, and network policies
  • Helm chart authoring and managing environment-specific overrides
  • Designing and maintaining CI/CD pipeline architecture with multi-stage builds and conditional triggers
  • Observability stack implementation and instrumentation using Grafana, Prometheus, and OpenTelemetry
  • Secrets lifecycle and credential rotation with automated syncing to Kubernetes and identity provider configuration
  • Production networking configuration: load balancing, TLS termination, DNS, and authentication proxies
  • Container build pipeline optimization: multi-stage builds, layer caching, image hardening, size reduction
  • Experience operating Temporal or similar workflow systems in production
  • Familiarity with graph databases (Neo4j) and object storage (MinIO, S3-compatible) on Kubernetes
  • Experience with Keycloak or similar identity providers (OIDC, realm configuration)
  • Cloud cost optimization experience (committed use, right-sizing, spot instances)
  • Familiarity with GitOps patterns and tools (ArgoCD, Flux)
  • Understanding of public key infrastructure, certificate management, and mTLS
  • Experience with hybrid networking between cloud and on-prem environments
The Company
17 Employees

What We Do

Lumbra is an AI company building the architecture for autonomous intelligence in the intelligence community. They are developing frameworks, orchestration layers, and agentic operating systems, including Nebula—an agentic harness designed to make AI agents reliable, evaluable, and useful for real analytical work in high-consequence environments. The team consists of engineers and operators from the intelligence community, special operations, and frontier AI research.
