Cube Asia

AI Engineer

Hiring Remotely in Bangkok, Phra Nakhon, Bangkok, THA

Remote

Mid level

eCommerce • Retail

The Role

Build and maintain self-hosted LLM features and infrastructure: chat interfaces, MCP server, RAG pipelines (ingestion, embeddings, vector stores, retrieval, reranking), prompts, evaluations, observability, and secure integrations. Work with Infrastructure and Data Engineering to deploy and operate production LLM systems and ship end-to-end AI features.

Summary Generated by Built In

As one of our early AI Engineering hires, you'll help define what AI at Cube looks like. You'll build the AI features people actually use, from our self-hosted chat interface and MCP server to retrieval pipelines, prompts, evaluations, and integrations with internal systems. You'll work closely with our Infrastructure and Data Engineering teams to design architecture, connect systems, and turn emerging AI capabilities into practical products and tools that solve real problems every day.

Maintain and tune our self-hosted chat interface, including model connections, MCP integration, and RAG/knowledge base setup
Build the RAG pipeline: ingestion, chunking, embeddings, vector store, retrieval, reranking, and evaluation
Integrate LiteLLM or OpenRouter as the gateway; handle routing, fallbacks, rate limits, and cost tracking
Maintain and configure the MCP server and the tools it exposes to the model
Write prompts and evaluations, and iterate on them based on real usage and failure cases
Monitor the logging, tracing, and guardrails of our AI platforms and models
Good to have: exposure to working with an MLOps/Platform team to deploy self-hosted models (vLLM, TGI, Ollama) and keep them healthy
Ship features end-to-end: API, retrieval, prompt, evaluation, and rollout

Requirements

4+ years of software engineering experience
Familiarity with containerized technologies and orchestration platforms such as Kubernetes
Strong interest in AI, LLMs, and the rapidly evolving model ecosystem
1+ years of experience building, deploying, or supporting production LLM systems (RAG, agents, or fine-tuned models)
Experience deploying and configuring self-hosted LLM chat interfaces (Open WebUI preferred; similar platforms are acceptable)
Hands-on experience with retrieval and RAG systems, including embeddings, vector databases, chunking strategies, hybrid search, and evaluation methodologies
Experience working with LLM gateways or routing layers such as LiteLLM, OpenRouter, Portkey, or similar solutions
Experience serving open-weight models using tools such as vLLM, TGI, or SGLang
Experience designing and implementing secure integrations between LLMs and internal business systems
Nice to have: Experience with or understanding of MCP servers, agent frameworks, or tool-calling architectures
Nice to have: Experience with or understanding of LLM observability and monitoring platforms such as LangSmith, Langfuse, or similar tools

Skills Required

4+ years of software engineering experience
Familiarity with containerized technologies and orchestration platforms such as Kubernetes
Strong interest in AI, LLMs, and the model ecosystem
1+ years building, deploying, or supporting production LLM systems (RAG, agents, or fine-tuned models)
Experience deploying and configuring self-hosted LLM chat interfaces (Open WebUI preferred)
Hands-on experience with retrieval and RAG systems: ingestion, chunking, embeddings, vector DBs, hybrid search, reranking, and evaluation
Experience working with LLM gateways or routing layers such as LiteLLM, OpenRouter, Portkey, or similar
Experience serving open-weight models using vLLM, TGI, Ollama, or similar
Experience designing and implementing secure integrations between LLMs and internal business systems
Exposure to MLOps/platform deployment of self-hosted models (vLLM, TGI, Ollama) and operational monitoring
Experience with or understanding of MCP servers, agent frameworks, or tool-calling architectures
Experience with LLM observability and monitoring platforms such as LangSmith or Langfuse

The Company

27 Employees

Year Founded: 2022

What We Do

We offer granular market data and insights for e-commerce in Southeast Asia, helping brands and retail companies drive profitable growth for their online businesses. Our Cube Subscription plans allow brands, retailers, consulting firms and investors to access the most accurate, granular and dynamic e-commerce intelligence in the market. Current clients include leading consumer brands, retailers and private equity firms. Cube Asia was previously known as Chalawan (chalawan.asia).