Senior MLOps Engineer (Vietnam)

Tensormesh

Reposted 14 Days Ago

Hiring Remotely in VN

Remote

Senior level

Artificial Intelligence • Machine Learning • Software • Infrastructure as a Service (IaaS)

The Role

Design and operate GPU/ML CI/CD and release pipelines (GitHub Actions, self-hosted H100/A100 runners). Implement packaging, multi-arch containers, continuous benchmarking/performance gates, contributor-friendly CI, and infrastructure-as-code with security and autoscaling.

Summary Generated by Built In

JOIN US – BUILD THE FUTURE OF AI WITH TENSORMESH.AI FROM VIETNAM!

Tensormesh.ai, a high-profile US AI startup spun off from the open-source LMCache project and now reshaping how the world understands and deploys high-performance AI, is officially expanding and building its Core Engineering team in Vietnam! We believe Vietnam deserves to be a core R&D hub for Southeast Asia, and you can be an important part of that journey.

We are looking for: MLOps Engineer — LMCache (Open-Source Infrastructure)

1. What You'll Own

- Pipeline architecture: GitHub Actions workflows + self-hosted GPU runner fleet (H100/A100); multi-stage pipeline from lint → unit → GPU integration → cross-framework compat (vLLM/SGLang) → performance regression

- Release engineering: semantic versioning, PyPI publishing, multi-arch container images, Helm charts, Sigstore/cosign signing, coordination with downstream integrators

- Performance gates: continuous benchmarking that blocks regressions in cache hit rate, TTFT (time to first token), throughput, and memory usage before merge

- Contributor experience: fast PR feedback, eliminate flakiness, dev containers that don't require expensive GPUs

- Security & IaC: SBOM/SLSA provenance, secret rotation, runner fleet via Terraform with cost-optimized autoscaling

2. Required

- 4+ years MLOps/DevOps/SRE; 2+ years CI/CD for GPU or ML workloads

- Deep GitHub Actions expertise (workflows, composite actions, self-hosted runners at scale)

- Python packaging & PyPI release flow (incl. wheels with native extensions)

- Docker multi-stage/multi-arch; NVIDIA Container Toolkit

- Terraform/Ansible for cloud GPU infrastructure

- Track record building CI that contributors trust — fast, non-flaky, clear failures

3. Strongly Preferred

- Maintainer/contributor experience on a popular OSS project

- Familiarity with vLLM, SGLang, NVIDIA Dynamo, KServe, or Triton

- Kubernetes in CI (Kind/k3s, multi-node integration tests)

- Continuous benchmarking tools + time-series perf tracking

- Supply chain security (Sigstore, SLSA, syft/grype)

- RDMA / high-perf networking / P2P system testing

Why Tensormesh.ai?

* Work directly with engineering teams in the US and Vietnam; the products you build will be used by the world's leading AI companies.

* Globally competitive compensation.

* An engineering-first culture with no barriers and no bureaucracy: just code, impact, and learning.

* Flexible remote/hybrid work.

Apply now! Send your CV + GitHub/LinkedIn to: [email protected] or [email protected]

Or tag a friend you think "deserves to be a core engineer at a global AI startup!"

Skills Required

4+ years MLOps/DevOps/SRE; 2+ years CI/CD for GPU or ML workloads
Deep GitHub Actions expertise (workflows, composite actions, self-hosted runners at scale)
Python packaging & PyPI release flow (including wheels with native extensions)
Docker multi-stage/multi-arch
NVIDIA Container Toolkit
Terraform or Ansible for cloud GPU infrastructure
Proven track record building fast, non-flaky CI with clear failures

The Company

Year Founded: 2025

What We Do

Tensormesh is an AI infrastructure optimization company that provides distributed AI compute infrastructure, including GPU clusters and inference optimization platforms, to reduce GPU costs and latency.