MLOps Engineer — AI/ML Systems & Deployment (TS/SCI Preferred)
Dayton, OH (On-site Preferred) | Remote Eligible (CAC-Ready Candidates)
Mission Environment | AI/ML Infrastructure | National Security Impact
About the Role
At Rackner, we are building the operational backbone that turns AI/ML capability into real-world mission outcomes. We are seeking an MLOps Engineer to own the lifecycle of AI/ML systems—from experimentation to deployment—within a mission-critical, classified environment supporting Air Force and NASIC-aligned programs.
This is not a research role. This is where models become reliable, deployable, auditable systems.
You will operate at the intersection of:
- Machine learning
- Distributed systems
- Cloud-native infrastructure
…and ensure that AI/ML systems work in the environments where failure is not an option.
What You’ll Do
Own the ML Lifecycle (End-to-End)
- Build and operate production-grade ML pipelines
- Orchestrate workflows using Kubeflow, Airflow, or Argo
- Implement model versioning, lineage, and reproducibility standards
Operationalize AI/ML Systems
- Deploy models into mission environments (including constrained or classified systems)
- Transition workflows from Jupyter experimentation → containerized pipelines → production systems
- Enable both batch and real-time inference architectures
Engineer for Reliability, Not Just Performance
- Design systems for reproducibility, auditability, and stability
- Implement monitoring for:
  - model performance & drift
  - system health & latency
- Use tools like Prometheus, Grafana, and OpenTelemetry
Build Cloud-Native ML Infrastructure
- Deploy and manage Kubernetes-based ML workloads
- Containerize pipelines using Docker / OCI standards
- Scale compute for training and inference workloads
Establish Data Discipline
- Enable data versioning and governance (lakeFS or similar)
- Support feature engineering and dataset preparation pipelines
- Apply metadata standards (e.g., STAC) where applicable
Create Repeatable Systems
- Develop runbooks, playbooks, and deployment standards
- Build systems that can be operated by others, not just understood by you
What You Bring
Core Experience
- Experience deploying ML systems into production environments
- Strong background in Python and ML frameworks (PyTorch, TensorFlow, etc.)
- Hands-on experience with:
  - ML pipeline orchestration tools (Kubeflow, Airflow, Argo)
  - Experiment tracking (MLflow, ClearML)
Infrastructure & Systems
- Experience with Kubernetes and containerized workloads
- Familiarity with CI/CD for ML systems
- Understanding of distributed systems and scalable architectures
ML Application Exposure
- Experience working with:
  - LLMs or transformer-based models
  - Computer vision systems (YOLO, Faster R-CNN)
- Focus on deployment and integration, not pure research
Mindset
- Systems thinker who values reliability over novelty
- Comfortable operating in ambiguous, high-stakes environments
- Able to translate experimental work into operational capability
Why This Role Matters (What You Get)
This role is a career accelerator for engineers who want to:
- Move beyond experimentation
  - Own systems that actually get deployed and used
- Operate at the systems level
  - Work across ML, infrastructure, and mission integration
- Build in high-trust environments
  - Where correctness, auditability, and reliability matter
- Develop rare, high-demand expertise
  - MLOps in constrained / classified environments is a differentiated skillset
Shape how AI is operationalized—not just built
Who We Are
Rackner is a software consultancy that builds cloud-native solutions for startups, enterprises, and the public sector. We are an energetic, growing consultancy with a passion for solving big problems across industries.
We enable digital transformation through:
- Distributed systems
- DevSecOps
- AI/ML
- Cloud-native architecture
Our approach is cloud-first, cost-effective, and outcome-driven—focused on delivering real capability, not just code.
Benefits & Perks
- 100% covered certifications & training aligned to your role
- 401(k) with 100% match up to 6%
- Highly competitive PTO
- Comprehensive Medical, Dental, Vision coverage
- Life Insurance + Short & Long-Term Disability
- Home office & equipment plan
- Industry-leading weekly pay schedule
Apply
If you’re an engineer who wants to move from building models to owning systems, we want to talk.
#MLOps #MachineLearning #Kubernetes #AIEngineering #CloudNative #DevSecOps #ArtificialIntelligence #DataEngineering #DefenseTech #NationalSecurity #AIInfrastructure #Hiring #TechCareers
What We Do
Rackner builds cutting-edge solutions that apply DevSecOps and the power of AI in the datacenter, public and private clouds, and edge, leveraging the future of compute capability and technologies like Kubernetes (k8s) and WebAssembly (WASM). We're a member of the Cloud Native Computing Foundation and a Kubernetes Certified Service Provider, as well as a partner to the major public cloud companies. Our customers include hypergrowth startups and federal agencies, both Civilian and Defense.
Core Competencies
- DevSecOps
- Edge Computing
- AI/ML
- Cloud-Native and Hybrid-Cloud development
- Web and Mobile Applications Development (Microservices)









