Senior Platform Engineer, Ingestion

Posted 2 Days Ago
Hiring Remotely in Sweden
Remote
Senior level
Information Technology • Software • Database
The Role
Own and scale high-throughput ingestion and trace-query systems; define SDK, CLI, and API standards; build and maintain integrations; debug performance and distributed-system issues; and run monitoring and alerting while participating in on-call incident response, all to ensure reliable observability and developer-facing surfaces.
Summary Generated by Built In
About Us

At LangChain, our mission is to make intelligent agents ubiquitous. We build the foundation for agent engineering in the real world, helping developers move from prototypes to production-ready AI agents that teams can rely on. We began as widely adopted open-source tools and have grown to also offer a platform for building, evaluating, deploying, and operating agents at scale.

With $125M raised at Series B from IVP, Sequoia, Benchmark, CapitalG, and Sapphire Ventures, we’re at a stage where we’re continuing to develop new products, growth is accelerating, and all team members have meaningful impact on what we build and how we work together. LangChain is a place where your contributions can shape how this technology shows up in the real world.

Today, our platform includes LangSmith (Observability, Evaluation, Deployment, Fleet, and Sandboxes), our open-source frameworks (LangChain, LangGraph, and Deep Agents), and the newly launched LangSmith Engine for autonomous agent improvement. We have 100M+ monthly open-source downloads and 6,000+ active LangSmith customers. Five of the Fortune 10 and 35% of the Fortune 500 overall use LangSmith in production, including teams at Klarna, Clay, Coinbase, Workday, Lyft, Cloudflare, Harvey, Rippling, Vanta, LinkedIn, Monday.com, Nvidia, and Bridgewater.

About the team

The LangSmith team owns and builds LangChain's core platform for observability, evaluation, and production reliability of AI systems. From tracing and annotation to run rules, evaluations, and beyond, this team owns LangSmith end-to-end. If you want to define what great AI observability looks like at production scale, this is where that work gets done.

About the role

This role sits at the core of LangSmith: you'll own the ingestion systems, query systems, and the API, SDK, and CLI surfaces that thousands of development teams use every day. You'll work at the intersection of distributed systems and developer experience, on infrastructure that teams across the industry depend on.
What you'll do:
  • Build and scale critical systems: design and operate high-throughput, data-intensive ingestion and trace-query systems supporting LangSmith, built on SmithDB, our purpose-built database for agent observability. Build monitoring, alerting, and automated recovery so the pipeline stays resilient.

  • Set API, SDK, and CLI standards: define and enforce the standards, tooling, and CI that power SDK generation across Python, TypeScript, Go, and Java; keep our developer surfaces consistent, high-quality, and self-serve across feature teams.

  • Own integrations: build new integrations and maintain existing ones so it's easy to use LangSmith with any AI framework, agent, or tool — keeping us framework-agnostic

  • Solve complex problems: debug performance bottlenecks, optimize database queries, and architect solutions for distributed-system challenges

  • Respond to incidents: participate in an on-call rotation focused on post-incident learning, automation, and prevention

How to be successful in this role:

Many of these will apply to you — we don't expect every box checked.

  • Platform engineering: hands-on experience designing and running data-intensive systems at scale

  • Developer experience: a track record of building high-quality, widely adopted CLIs, SDKs, or API standards that developers actually enjoy using

  • Database expertise: production experience with OSS datastores (PostgreSQL, Redis)

  • Backend languages: strong backend software engineering skills with production-level experience in Go, Python, or TypeScript

  • Infrastructure expertise: solid knowledge of cloud object storage, Kubernetes, containerized infrastructure, and cloud platforms (GCP, AWS)

  • Observability mastery: hands-on experience with observability stacks (Datadog, Prometheus/Grafana, OpenTelemetry, or similar)

  • Operational mindset and high agency: "you build it, you run it, you own it," with a focus on sustainable practices

Nice to Have:
  • Experience: 5+ years building and operating production systems, developer-facing APIs, or both

  • Strong experience with Java

  • Knowledge of columnar file and memory formats and OLAP databases

  • Background in high-growth startups

Location: This role is fully remote within Europe, excluding France.

Compensation & Benefits

We offer competitive compensation that includes base salary, meaningful equity, and benefits such as health and dental coverage, flexible vacation, a 401(k) plan, and life insurance. Actual compensation will vary based on role, level, and location.

Compensation Philosophy:

We offer competitive compensation that includes base salary, variable compensation for relevant roles, meaningful equity, benefits, and perks. Actual compensation and offerings will vary based on role, level, and location. Team members in the EU, UK, and APAC receive locally competitive benefits aligned with regional norms and regulations.

Benefits

Benefits include medical, dental, and vision coverage, flexible vacation, a 401(k) plan, meals on in-office days in the US, and more.

Skills Required

  • Designing and operating data-intensive, high-throughput ingestion and trace-query systems
  • Building and maintaining CLIs, SDKs, and API standards (developer experience)
  • Production experience with PostgreSQL
  • Production experience with Redis
  • Production-level backend development experience in Go, Python, or TypeScript
  • Experience defining SDK generation and tooling across multiple languages
  • Knowledge of cloud object storage and containerized infrastructure
  • Kubernetes experience and familiarity with cloud platforms (GCP or AWS)
  • Hands-on experience with observability stacks (Datadog, Prometheus/Grafana, OpenTelemetry)
  • Operational mindset: on-call participation, monitoring, alerting, incident response and automation
  • 5+ years building and operating production systems or developer-facing APIs
  • Strong experience with Java
  • Knowledge of columnar file/memory formats and OLAP databases
  • Background in high-growth startups

The Company
123 Employees

What We Do

LangChain is the platform for building reliable agents. Our products power top engineering teams — from fast-growing startups like Lovable, Mercor, and Clay to global brands including AT&T, Home Depot, and Klarna. LangGraph is a low-level orchestration framework for building controllable agents and long-running workflows. It’s used in production by teams at Replit, Uber, LinkedIn, GitLab, and more. LangSmith offers unified evaluation and monitoring to help developers debug, evaluate, and improve their agents at scale. LangChain provides hundreds of integrations and composable components, making it easy to connect with the latest models, tools, and databases — with minimal engineering overhead. Together, these tools help teams build, deploy, and manage enterprise-grade agents, faster.
