Senior DevOps Engineer

Reposted 10 Days Ago
New York City, NY
In-Office
160K-220K Annually
Senior level
Artificial Intelligence • Machine Learning • Natural Language Processing • Retail • Business Intelligence • Conversational AI
We're building the future of Enterprise Intelligence.
About Merciv

Merciv is pioneering autonomous retail intelligence through EVA (Evolving Virtual Analyst), our agentic AI platform that's transforming how the world's largest retailers operate. We don't just provide insights -- we enable AI agents to actively manage critical business functions from consumer intelligence to inventory optimization to competitive analysis.

We've intentionally stayed under the radar while building something transformative. Our platform already powers retail intelligence for Fortune 500 leaders including Gap Inc. (and their portfolio brands like Old Navy, Banana Republic, and Athleta), Hain Celestial (Terra, Celestial Seasonings, etc.), and Boston Beer Company (Samuel Adams, Truly, etc.). Fresh off a $14M Series Seed, we're scaling our team of innovators as we prepare for significant growth in 2025-26.

We believe in making every feature intuitive and every component interactive, helping our users leverage the tools they recognize in a whole new light. Our lean, agile team thrives on innovation and autonomy. Here, you won't just fill a role—you'll shape the future of our product and company. We embrace a 'many hats' approach, offering unparalleled opportunities for growth and impact.

Our work environment thrives on creativity and cutting-edge technology, built for those eager to shape the future of human-AI collaboration. If you’re excited about being a part of a fast-paced, innovative startup culture and making a significant impact on client business, Merciv is the place for you. Join us as we expand the horizons of retail intelligence and deliver unparalleled value to our clients.

Current state:
  • Multiple Fortune 500 enterprise customers

  • Strong distribution partnerships driving the majority of revenue

  • Platform managing millions of data points across retail ecosystems

  • Small (sub-20) team of exceptional AI researchers, engineers, and operators

Where we're going:
  • 50,000+ users on our platform within 12 months

  • $20M+ ARR by end of 2026

  • Category leadership in autonomous retail operations

  • Series A fundraise in the next 12 months

The Role

Merciv is transforming retail through autonomous AI that doesn't just analyze businesses - it helps run them. Our platform powers some of the largest retail organizations in the world, processing data and managing workflows across millions of data points. We're building the infrastructure that enables AI agents to inform consumer-facing strategies, make million-dollar inventory decisions, optimize pricing in real time, and orchestrate complex retail operations 24/7.

As we scale our enterprise platform, we need a Senior DevOps Engineer who can build and maintain the rock-solid, secure infrastructure that autonomous commerce demands. This is your chance to architect the systems powering the future of retail AI.

You'll own the infrastructure that supports AI agents making split-second decisions for major retailers. Working closely with our ML and backend engineers, you'll ensure our platform maintains 99.97%+ uptime while handling Black Friday-level traffic every day. This is a hands-on role where you'll build the secure, scalable systems that enterprise retailers trust with their entire operations.

In this role, you'll be the guardian of infrastructure that must be enterprise-grade (SOC 2, GDPR, ISO 27001 compliant) while maintaining startup agility. Your work directly impacts whether a retailer's AI can respond to market changes in milliseconds or minutes - a difference measured in millions of dollars.

What You'll Do
  • Scale AI Infrastructure: Architect and optimize infrastructure supporting high-volume daily agentic decisions

  • Ensure Enterprise Reliability: Build systems that maintain 99.97% uptime for mission-critical retail operations across Fortune 500 clients

  • Automate Everything: Develop robust CI/CD pipelines for rapid ML model deployment and infrastructure updates without downtime

  • Secure Sensitive Data: Implement and maintain SOC 2, GDPR, and ISO 27001 compliant infrastructure for enterprise retail data

  • Optimize AI/ML Workflows: Partner with engineers to streamline model training, deployment, and inference pipelines at scale

  • Champion GitOps: Implement infrastructure-as-code practices that let us scale from hundreds to thousands of AI agents seamlessly

  • Monitor Autonomous Systems: Build observability into distributed agent networks processing millions of retail data points

  • Enable Multi-Tenancy: Design secure, isolated environments for enterprise clients while maintaining operational efficiency

  • Integrate Enterprise Systems: Support seamless connections with Shopify Plus, SAP, Oracle Retail, and other major platforms

  • Own Production Excellence: Lead incident response for a platform where minutes of downtime could mean millions in lost revenue

Core Requirements

Experience
  • 6-10+ years of industry experience with at least 4 years in hands-on DevOps roles

  • 4+ years managing cloud infrastructure in production (AWS strongly preferred)

  • 2+ years of production Kubernetes experience (EKS preferred)

Technical Skills

Cloud & Infrastructure

  • Expert-level AWS knowledge (EC2, EKS, Lambda, S3, RDS, IAM, Secrets Manager, KMS)

  • Advanced Infrastructure-as-Code expertise with Terraform and Terragrunt

  • Strong GitOps experience and configuration management (Ansible)

  • Hands-on experience with bare metal configuration and machine templates

Containers & Orchestration

  • Advanced Docker knowledge and container debugging skills

  • Production Kubernetes with Helm, FluxCD, and KEDA

  • Container-based deployment strategies (blue-green, canary, rolling)

Programming & Automation

  • Required: Strong Python and Bash scripting for automation and CLI tool development

  • CI/CD pipeline design with GitHub Actions and other platforms

  • Ability to write robust, production-ready automation

Monitoring & Reliability

  • Experience with observability stacks (New Relic preferred, CloudWatch, Prometheus/InfluxDB)

  • Distributed tracing, log aggregation, and alerting strategies

  • Root cause analysis and post-mortem expertise

Security & Networking

  • Deep understanding of network security, load balancing, and DNS

  • IAM best practices, key management, and secret rotation

  • Compliance experience (SOC 2, GDPR) and zero-trust architecture principles

  • Threat modeling capabilities with a proactive security mindset

Systems

  • Solid Linux administration and system debugging skills

  • Strong networking fundamentals and troubleshooting abilities

The Infrastructure Challenge

At Merciv, you'll tackle unique challenges at the intersection of AI, real-time systems, and enterprise retail:

  • Extreme Scale: Support distributed AI agents making high-volume decisions daily across consumer insights, inventory, pricing, and promotions

  • Ultra-Low Latency: Maintain sub-30ms decision latency for real-time retail operations

  • Multi-Tenant Architecture: Securely isolate data and compute for competing retailers on the same platform

  • ML Operations: Enable rapid iteration of AI models while ensuring zero-downtime deployments

  • Enterprise Integration: Seamlessly connect with legacy retail systems and modern cloud platforms

  • Compliance at Speed: Maintain strict regulatory compliance without sacrificing deployment velocity

Nice-to-Have Skills
  • Backend or full-stack development experience

  • AI/ML infrastructure experience (model serving, GPU clusters, training pipelines)

  • Experience with real-time, high-throughput data systems

  • Multi-tenant SaaS platform expertise

  • Retail or e-commerce domain knowledge

  • eBPF for advanced observability

  • Experience with Terraform Cloud at scale

  • Service mesh technologies

  • Multi-region deployment expertise for global retail operations

  • SecOps experience at enterprise scale

  • Experience with event-driven architectures

  • Knowledge of streaming platforms (Kafka, Kinesis)

What We're Looking For
  • Ownership Mentality: Track record of owning critical infrastructure outcomes

  • Problem Solver: Strong debugging skills with the ability to work through ambiguous problems

  • Security-First Mindset: Proactive approach to identifying and mitigating threats

  • Clear Communicator: Excellent written and verbal communication, comfortable with sync and async work

  • Documentation Champion: Creates clear runbooks, architecture diagrams, and knowledge bases

  • Collaborative Spirit: Motivated by helping others succeed and working cross-functionally

  • Strategic Thinker: Ability to balance immediate needs with long-term infrastructure vision

  • Growth Mindset: Continuous learner who stays current with DevOps best practices

Why Join Us?
  • Revolutionary Impact: Build infrastructure for autonomous AI that directly affects retailers’ bottom lines

  • Cutting-Edge Stack: Work with the latest in AI/ML infrastructure, distributed systems, and cloud-native architectures

  • Enterprise Trust: Your work enables Fortune 500 retailers to trust AI with million-dollar decisions

  • Rapid Growth: Join us as we expand from retail into new verticals, scaling our platform globally

  • Technical Excellence: Collaborate with world-class engineers building autonomous AI agents that are redefining commerce

  • Ownership & Equity: Significant equity participation in a company transforming a $30 trillion industry

  • Innovation Freedom: Shape the infrastructure strategy for a platform processing billions in retail transactions

  • Customer Impact: See your work directly impact major brands

  • Professional Growth: Budget for conferences, certifications, and staying at the forefront of DevOps and AI infrastructure

Compensation

Compensation Range: $160k - $220k

Benefits:

  • Health

  • Dental

  • Vision

  • Life

  • Commuter

Interview Process

We respect your time and aim to complete our interview process within 2-4 weeks:

  1. Technical screening call (45-60 minutes)

  2. Technical deep dives covering infrastructure, architecture, and hands-on coding

  3. Team collaboration session (preferably in-person)

  4. Culture & vision discussion with leadership

Merciv is building the future of autonomous commerce. We're committed to assembling a diverse team of builders who want to revolutionize how the world does business.

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or any other applicable legally protected characteristic.


Top Skills

Ansible
AWS
Bash
CloudWatch
Docker
GitHub Actions
InfluxDB
Kubernetes
New Relic
Prometheus
Python
Terraform

The Company
7 Employees
Year Founded: 2022

What We Do

Merciv provides enterprises with an end-to-end intelligence platform that guides data through a complete journey—from raw information to clear insights, organizational knowledge, and predictive foresight. By seamlessly integrating vast data resources, offering natural language interaction, and delivering proactive, actionable intelligence, Merciv empowers organizations to unlock hidden opportunities, optimize operations, and shape their future in an increasingly complex, data-driven world.
