Lead Software Engineer II, AI Operations

Hiring Remotely in Flexia, Sarthe, Loire
In-Office or Remote
150K-170K Annually
Senior level
Fintech
The Role
Lead the design, shipping, and operation of LLM applications in AI Operations, focusing on optimizing costs, building pipelines, and ensuring quality through monitoring and governance.
Best Egg is a market-leading, tech-enabled financial platform helping people build financial confidence through a variety of installment lending solutions and financial health tools. We aim to help customers make smart financial decisions and stay on track, so they can be money confident no matter what life throws at them.
We offer top-tier benefits and growth opportunities in a culture built on our core values:

Put People First – We foster an inclusive, flexible, and fun workplace.
Create Clarity – Open communication drives trust and results.
Get Things Done – We focus, prioritize, and deliver with excellence.
Deliver with Heart – We lead with kindness, humility, and strong teamwork.
Listen to Our Customers – Their needs drive our innovation.
 
Barclays has entered into an agreement to acquire Best Egg with closing expected to take place in Q2 2026. This acquisition will give us the resources and capital to continue on our mission and drive our strategy forward. With an aligned culture, lower cost of funds, and increased employee growth opportunities across a global brand, we are excited about the future of the Best Egg brand under the Barclays umbrella.
We are looking for collaborative, innovative team players who like to solve problems. There will also be immense opportunities for those willing to dive in. If you're inspired by growth and want to make a real difference, Best Egg is the place for you.

We’re proud to be an equal opportunity employer committed to building a diverse, inclusive team.

Best Egg is hiring a Lead Software Engineer II for AI Operations to design, ship, and operate production-grade LLM applications, agents, and automations across the business. You’ll own the end‑to‑end path from prototype to stable deployment: building RAG pipelines, instituting evals and guardrails, and driving cost/performance optimization. Our stack includes Python, Metaflow on Outerbounds, AWS (including Bedrock), OpenAI/ChatGPT, and Cursor; Databricks is being evaluated and is available where it makes sense. Your work will accelerate delivery, reduce LLM unit costs, and improve output quality for use cases like agent assist, compliance automation, process automation, and QA, treating AI Ops as a force multiplier for the enterprise.

Key Responsibilities

  • Build and ship LLM apps & agents: Deliver internal copilots and customer/agent-facing automations with clear SLAs, rollbacks, and observability from day one. 
  • Own RAG pipelines: Design ingestion, chunking, embeddings, indexing, hybrid search/rerank, and retrieval evaluation; track retriever quality via offline golden sets and online metrics. 
  • AWS Infrastructure & Orchestration: Design and implement scalable AWS architectures, including AWS AI features such as Bedrock and knowledge bases, along with IAM, secure secrets and policy enforcement, automated provisioning, and resource-usage governance as core platform capabilities.
  • Observability & SRE for AI: Add tracing, prompt/agent version lineage, eval dashboards, and regression alerts; establish golden datasets and canary tests. 
  • Guardrails & governance: Enforce PII redaction, safety filters, role-based access, audit logs, and human‑in‑the‑loop review paths to control quality and risk. 
  • CI/CD for AI artifacts: Version and deploy prompts, tools, agents, and retrieval pipelines; support blue/green and shadow deploys with automatic rollback triggers. 
  • Cost & performance: Cut run‑rate spend through caching, truncation, batching, autoscaling, and model routing; establish clear unit economics per workflow. 
  • Developer enablement: Provide templates, SDKs, and high‑quality abstractions that let product teams ship safely without bespoke plumbing; improve developer experience. 
  • Platform integration: Build primarily in Python and Metaflow (Outerbounds); deploy on AWS (Bedrock + core services) and OpenAI; use Cursor in daily workflows; help evaluate and, when appropriate, run on Databricks.
  • Production posture: Participate in on‑call, author runbooks, and remove single‑thread risk for AI services; drive reliability and resilience akin to ML Ops.  

What You’ll Need to Succeed:

  • Experience: 5–10 years of professional software engineering (or equivalent) with 2+ years building AI/LLM applications; portfolio of shipped AI projects (links to code, demos, or case studies).
  • Exploration: Demonstrated passion for relentless exploration of the latest AI models, frameworks, and tooling, ensuring constant adoption of state-of-the-art innovations in the workflow.
  • LLM product engineering: Hands‑on with some or all of OpenAI, Bedrock, and Hugging Face/Ollama/vLLM; MCP servers and function/tool calling, multi‑turn orchestration, streaming, and prompt/version management.
  • RAG expertise: Practical experience designing and tuning retrieval systems (chunking, embeddings, hybrid search, reranking), integrating with vector databases, and measuring retrieval quality.
  • Full‑stack or equivalent backend depth: Comfortable building APIs/services and simple UIs where needed; strong fundamentals in Python and modern packaging/testing.
  • DevOps & deployment: CI/CD, containers, cloud fundamentals (AWS), and runtime performance tuning; experience operating services in production.
  • Platform & orchestration: Metaflow (Outerbounds) preferred; Databricks familiarity is a plus; ability to integrate data/feature pipelines and schedule/operate flows.
  • Observability & testing for AI: Tracing and logging; expertise with tools such as Datadog, Dynatrace, or Grafana, where relevant for AI monitoring, is essential.
  • Cost, quality, and risk mindset: Comfortable optimizing latency/throughput/cost, and implementing guardrails for PII/safety/compliance. 
  • Collaboration & mentorship: Partner effectively with data scientists, analysts, and engineers; promote best practices and high‑leverage abstractions. 
  • Bonus points: Fine‑tuning or distillation experience; Kubernetes or FastAPI exposure; familiarity with Snowflake or similar warehousing for retrieval sources.  

This role sits in AI Operations and focuses on making AI safe, fast, and economical to scale—unlocking multiple use cases through one high‑leverage engineering hire. 

Please include links to your portfolio (GitHub, write‑ups, or demos) with your application.

Employee Benefits
Best Egg offers many additional benefits for our employees, including (but not limited to):
  • Pre-tax and post-tax retirement savings plans with a competitive company matching program
  • Generous paid time-off plans including vacation, personal/sick time, paid short-term and long-term disability leaves, paid parental leave, and paid company holidays
  • Multiple health care plans to choose from, including dental and vision options
  • Flexible Spending Plans for Health Care, Dependent Care, and Health Reimbursement Accounts
  • Company-paid benefits such as life insurance, wellness platforms, employee assistance programs, and Health Advocate programs
  • Other great discounted benefits include identity theft protection, pet insurance, fitness center reimbursements, and many more!

In compliance with the CCPA, Best Egg is fully committed to handling the personal information and data of employees and job applicants responsibly, with respect and due care. Review our CCPA Employee Policy.

Top Skills

AWS
ChatGPT
Cursor
Databricks
Metaflow
OpenAI
Python

The Company
HQ: Wilmington, DE
498 Employees
Year Founded: 2013

What We Do

Best Egg is a consumer financial technology platform that aims to help people feel more confident about their everyday finances through a suite of products and resources. Our digital financial platform offers simple, accessible, and personalized financial solutions including personal loans, credit cards, and a financial health resource center.

Our culture and values are one of the core reasons why our customers keep returning to Best Egg. We are committed to championing a culture of inclusiveness and diversity of thought, and we focus on providing a safe, flexible, and collaborative work environment. Our associates are encouraged to engage in creative problem solving, and we promote opportunities for growth and enrichment across the organization.

If you are inspired by inspiring others, Best Egg is the place for you.
