- Design and build the foundational infrastructure for deploying AI agents at scale, working with emerging technologies like LangChain and Agent Development Kit to create capabilities that don't yet exist in the market
- Partner directly with AI/ML teams to enable rapid deployment of agentic solutions, turning research into production systems
- Architect, deploy, and operate our Kubernetes platform across cloud (GCP/AWS) and data center environments, ensuring 99.9%+ uptime for systems processing billions of daily transactions
- Build the tooling and automation that makes complex systems simple for 100+ developers
- Create operators, microservices, and management tools that shape how engineers across DV interact with infrastructure
- Drive CI/CD evolution, building self-service capabilities that empower teams while maintaining security and reliability
- Influence product design from the ground up, ensuring new features integrate seamlessly with DV's infrastructure
- Work across multidisciplinary DevOps teams to establish patterns that scale across the organization
Required:
- 5+ years in DevOps/Platform Engineering roles with significant software development (we need engineers who write code, not just YAML)
- 3+ years owning and operating production Kubernetes platforms (cloud or on-prem)
- Strong programming skills in at least one of: Python, Go, or TypeScript - you should be comfortable building microservices and automation tooling
- Cloud platform expertise in GCP or AWS (multi-cloud experience is a plus)
- Observability mindset - you naturally think about metrics, logging, tracing, and use data to drive decisions
- Builder mentality - you're energized by creating systems that encourage automation, observability, and ease of maintenance
Nice to Have:
- Experience developing AI workloads using frameworks like LangChain, CrewAI, or Google Agent Development Kit
- Background in site reliability engineering (SRE) practices
- Contributions to open-source DevOps tooling
- Experience with Terraform, ArgoCD, or infrastructure-as-code at scale
Beyond the Resume:
- You're a strong collaborator who can translate technical concepts for non-technical stakeholders and work effectively across teams
- You have genuine curiosity about improving developer experience—you ask "how can we make this easier?" not "that's not my job"
- You bring initiative and autonomy—when you see a problem, you propose solutions
- Cloud: GCP (primary), AWS, Oracle Cloud (emerging)
- Orchestration: Kubernetes (GKE, Rancher), ArgoCD, Helm
- AI/ML: LangChain, OpenAI Agents SDK, Google ADK, n8n
- Observability: Prometheus, Grafana, Datadog
- CI/CD: GitLab CI/CD, GitHub Actions
- IaC: Terraform, Ansible
- Languages: Python, Go, Bash, TypeScript
Real Innovation: We're not just keeping the lights on—you'll work on bleeding-edge AI infrastructure that doesn't have a playbook yet. Your work directly influences how Fortune 500 companies verify billions of dollars in digital media spend.
- Hybrid flexibility (3 days in NYC office, flexible scheduling)
- Strong remote collaboration tools and practices
- Learning budget for certifications, conferences, and courses
- Access to cutting-edge AI tools (Cursor, GitHub Copilot, etc.)
What We Do
DV is powering the new standard of marketing performance, giving advertisers clarity and confidence in their digital investment. Built on best practices, DV solutions create value for media buyers and sellers by bringing transparency and accountability to the market, ensuring ad viewability, brand safety, fraud protection, accurate impression delivery, and audience quality across campaigns to drive performance. Since 2008, DV has helped hundreds of Fortune 500 companies get the most value from their media spend by delivering best-in-class solutions across the digital ecosystem that help build a better industry.
Learn more at doubleverify.com.









