MLOps Engineer
Location: Hybrid – Arlington, Virginia
Employment Type: Full-time
BizFirst is assisting our client with the hiring of an MLOps Engineer to build and
operate the infrastructure, tooling, and processes that keep machine learning
models running reliably in production. This is a foundational role in the
client’s growing AI practice, sitting at the intersection of data engineering,
platform engineering, and applied ML – where your work directly enables data
scientists and ML engineers to move faster and ship with confidence.
Our client is a mid-market professional services organization that is actively
rethinking how it designs and executes its core business operations through
artificial intelligence and automation. The company is building a dedicated AI
capability to embed machine learning and generative AI into its most critical
internal workflows, from decision support and process automation to real-time
analytics and intelligent document processing.
What you will do
The ideal candidate has 4–8 years of experience in MLOps, DevOps, or platform/data
engineering, with direct experience standing up and maintaining ML
infrastructure in cloud environments. You have worked with CI/CD pipelines,
containerized ML workloads, and model registries – and you understand what it
takes to move models from a notebook to a production system that is observable,
scalable, and maintainable.
Responsibilities:
• Design, build, and maintain end-to-end ML pipelines including data ingestion, feature engineering, model training, evaluation, and deployment.
• Implement and manage CI/CD workflows for ML models, ensuring consistent, automated paths from experimentation to production.
• Own the model registry, versioning strategy, and experiment tracking infrastructure used across the AI team.
• Build monitoring and alerting systems to detect model drift, data quality issues, and performance degradation in deployed systems.
• Manage containerized ML workloads using Docker and Kubernetes, including scheduling, resource allocation, and cost optimization.
• Collaborate closely with data scientists and ML engineers to understand infrastructure needs and reduce friction in the development lifecycle.
• Evaluate and adopt MLOps tooling (orchestration, feature stores, serving frameworks) to mature the team’s operational practices.
• Develop runbooks, documentation, and incident response procedures for production ML systems.
Requirements:
US Citizen or Permanent Resident authorized to work in the United States.
Experience: 4–8 years in MLOps, platform engineering, or a DevOps role with direct ML workload responsibility.
Infrastructure: Proficiency with Docker, Kubernetes, and cloud platforms (AWS SageMaker, GCP Vertex AI, or Azure ML).
Pipelines: Hands-on experience with orchestration tools such as Airflow, Prefect, Kubeflow Pipelines, or similar.
ML Tooling: Working knowledge of MLflow, Weights & Biases, or equivalent experiment tracking and model registry platforms.
Programming: Strong Python skills; comfort writing infrastructure-as-code (Terraform, Pulumi, or CloudFormation).
Monitoring: Experience building observability into production ML systems – metrics, logging, alerting, and dashboards.
Preferred:
Experience supporting generative AI workloads, including LLM inference infrastructure and GPU resource management.
Familiarity with feature stores (Feast, Tecton, or similar) and online/offline feature serving patterns.
Background working in a fast-moving team where data scientists and ML engineers are primary customers.
Experience with cost optimization strategies for large-scale cloud-based ML training and inference.
Degree in Computer Science, Software Engineering, or a related technical field.
Benefits:
• Family Health Care (54% cost covered for the entire family)
• Family Dental (54% cost covered for the entire family)
• Family Vision (54% cost covered for the entire family)
• Flexible Spending Account
• Performance bonuses tied to project and delivery milestones
• Lifetime Event Bonuses (e.g., new child, marriage)
• Profit-sharing arrangement for any work brought into the company
• Unlimited Leave with Approval
• 401(k) – 100% employer match on the first 4% invested
• $1,500 annual training and conference budget
Job Type: Full-time, Permanent Position
Work Authorization: US Citizen or Permanent Resident; no active security clearance required.
Schedule: Monday to Friday
Work Location: Hybrid – Arlington, Virginia
Skills Required
- 4–8 years of experience in MLOps, DevOps, or platform engineering
- Proficiency with Docker, Kubernetes, and cloud platforms
- Hands-on experience with orchestration tools such as Airflow or Prefect
- Strong Python skills
- Experience building observability into production ML systems
What We Do
BizFirst LLC is a recruitment services provider that partners with businesses of various sizes, offering tailored staffing solutions including traditional recruitment, subscription-based services, and on-demand IT staffing.