Machine Learning / AI Operations Engineer

Reposted 17 Days Ago
San Francisco, CA
In-Office
Senior level
Artificial Intelligence • Software
The Role
As a Machine Learning/AI Operations Engineer, you will build, deploy, and maintain AI models, ensuring reliability and performance for enterprise-level applications, while shaping the team's roadmap and leveraging AI to enhance productivity.

Everyone's talking about AI. But here's the truth: ChatGPT can't send your emails. It can't book your flights. It can't even order you lunch.

Why? Because AI is trapped in a chat box. It can't take real actions in the real world.

We are changing that forever. We're not just building another AI company - we're creating the infrastructure that will power every AI application you'll use in the future.

The Revolution Needs You

Every AI app needs agentic "tools" - special functions that let AI models take real actions. Without tools, AI can only chat. With tools, AI can actually do things. We're building the definitive tools catalog and tool-calling platform that will unlock AI's true potential. Think Zapier for AI Actions. Think Auth0 for AI. Think really big.

Why This Is The Opportunity of a Lifetime

  • Founder-Market Fit: Our CEO previously founded Stormpath (acquired by Okta), where he created the first Authentication API for developers. He's done this before - and this time the market is 10x bigger. Our CTO led the vector database team at Redis, shipped 100+ LLM applications, and is a contributor to LangChain and LlamaIndex. He knows this space better than anyone.

  • Dream Team: We've assembled authentication, integrations, distributed systems, and AI experts from Okta, Redis, Microsoft, Splunk, ngrok, Google, Airbyte, Disney, Snowflake, and HPE who have built and founded multiple successful developer platforms.

  • Perfect Timing: We're at the inflection point of AI adoption. The biggest problem isn't better models - it's connecting AI to real-world actions. That's us.

  • Massive Market: We're building critical infrastructure for the biggest technological shift of our generation. Every AI app will need what we're building.

  • Backed By The Best: Our investors have backed Databricks, ClickHouse, MongoDB, Perplexity, Cohere, Scale AI, Confluent, Elastic, and Firebase. They see what we see - this is going to be huge.

The Challenge

As our first Machine Learning/AI Operations Engineer, you will be responsible for building and maintaining the models and infrastructure that power Arcade's agentic features. We are building state-of-the-art features to make everyone's agents more powerful, including agent tools, memory & context management, and related products. For certain tasks, building custom models is the way to deliver the performance and accuracy that customers are asking for, which hasn't been possible until now. We need your help ensuring that these features work reliably and quickly, both in our cloud and on-premises for our enterprise customers.

What You'll Do

  • Build: Create bleeding-edge models fine-tuned for Arcade's agentic products.

  • Deploy: Test and deploy our models and related application software, both on-prem and in our cloud.

  • Monitor: Build the systems to keep our models running. Make the models and APIs better. Collect the data you need to do it.

  • Build our stack: Use your experience to choose the right tools for the job, balancing speed, maintainability, and cost.

  • Shape the roadmap for the team.

  • Build leverage (via AI) - projects that take a week today should take a day next time.

  • Share your work with our customers and community, building our (and your) brand.

Required Skills

  • An insatiable desire to ship with your team.

  • Strong understanding of the state of the art in machine learning, especially LLMs and tool-calling (e.g. MCP).

  • Comfort with fine-tuning libraries and techniques such as LoRA/QLoRA and quantization.

  • Familiarity with model lifecycle management tools (MLflow, Weights & Biases, DVC, etc.).

  • Experience with model optimization, quantization, and deployment formats (ONNX, TensorRT, etc.).

  • Experience with modern monitoring tools (Prometheus, Grafana, Datadog, Arize AI, etc.)

  • Production experience with at least one major agent framework (LangChain, LlamaIndex, OpenAI Agents SDK, Mastra, etc.).

  • 5+ years of software engineering experience, comprising:

    • 3+ years of experience working on a production-level ML training or inference system

    • 2+ years of backend development experience with either Python or Go

    • 2+ years of experience with infrastructure and deployment (AWS/GCP, Terraform, Helm, etc.)

  • Strong experience building and maintaining production APIs.

  • Track record of writing clean, well-documented, well-tested code.

  • User-centered approach to designing developer-centric products and tools.

Bonus Points

  • Open-source contributions

  • Experience with high-scale distributed systems

  • You’ve been a startup founder or early-stage startup employee before and love it.

Join The Movement

We're not just building a product - we're leading a movement to transform AI from just chatbots to agents that can take actions against real systems. This is your chance to be at the forefront of that revolution.

If you want to look back in 5 years and say, "I helped build that", then we want to talk to you.

Ready to make AI actually useful? Apply Now

Top Skills

Arize AI
AWS
Datadog
DVC
ELK
GCP
Go
Grafana
Helm
LangChain
LlamaIndex
MLflow
ONNX
OpenAI Agents SDK
OpenVINO
Prometheus
Python
PyTorch
Scikit-Learn
TensorFlow
TensorRT
Terraform
Weights & Biases

The Company
HQ: San Francisco, CA
17 Employees
Year Founded: 2024

What We Do

Arcade is an AI Tool-calling Platform. For the first time, AI can securely act on behalf of users through Arcade's authenticated integrations, or "tools" in AI lingo. Connect AI to email, files, calendars, and APIs to build assistants that don't just chat – they get work done. Start building in minutes with our pre-built connectors or custom SDK.
