Virtuos

Senior AI Engineer

Posted 5 Days Ago

2 Locations

In-Office or Remote

Senior level

Gaming • Software

The Role

Design and build an internal agent platform and runtimes for LLM-driven tooling: agent loops, memory, evaluation, observability, and secure backend services. Implement APIs, data stores, RBAC, and CI/CD, and integrate local and hosted LLMs. Collaborate cross-functionally to productionize agentic systems and maintain test- and eval-driven workflows.

Summary Generated by Built In

PLAY, GROW and WIN

To be a part of Virtuos means to be a creator. 

At Virtuos, we harness the latest technologies to make games better and more immersive than ever before. That is why we have prided ourselves on constantly pushing the boundaries of possibility since our founding in 2004.

Virtuosi are a team of experts – people who have come together to share their mutual passion for making and playing games. People with the same enthusiasm for exploring new ideas and the constant drive to excel in their field. People who believe in earning success through dedication.

At Virtuos, we are at the forefront of gaming, creating exciting new experiences daily. Join us to Play, Grow and Win – together.

ABOUT THE POSITION

We are looking for a Senior AI Engineer to help design and deliver agentic AI systems that power R&D tooling for video game asset pipelines and production workflows. You will help shape the technical direction of our internal agent platform and drive engineering practices around agent loops, memory, evaluation, and safe deployment of LLM-driven applications.

This is a senior, hands-on individual contributor role: you will write code, help design the agentic architecture, and partner with stakeholders across studios to turn emerging AI capabilities into production-grade tools.

Responsibilities

Agent platform
- Design and build key parts of our internal agent libraries - the core abstractions and developer ergonomics that let teams across the company build agents quickly and consistently.
- Help shape the architecture of our central agent platform - the runtime, registry, and observability surface where agents are deployed, monitored, and governed.
- Help define and evolve the agent loop / harness: prompt orchestration, tool invocation, sub-agent delegation, and recovery behavior.
- Bring in reference patterns from the broader ecosystem (e.g. open-source agent loops and harness projects) and adapt them to our use cases.
Agent loop & harness engineering
- Drive prompting strategy at scale: system prompt design, guardrails, mitigation of context poisoning and pollution, management of inference parameters (context window sizing, temperature, top-k), and handling of lost-in-the-middle effects.
- Design tool interfaces for agents: MCP servers, structured inputs/outputs for context, and sub-agent composition patterns.
- Advocate for best practices for typed-agent frameworks, with first-class observability and telemetry baked into every agent.
- Evaluate and integrate local LLM options where latency, cost, or data-residency requirements demand it.
Agent memory
- Design and build the key parts of the memory layer used across our agents: conversation history management, context chaining, and episodic memory.
- Help define the boundary between short-term working context and long-term persistent memory, including decay/retention policies.
- Apply RBAC and tenant isolation to memory so agents can be safely shared across teams and projects.
Test- and eval-driven development
- Build out the evaluation discipline for agentic systems: golden traces, regression evals, offline + online metrics, and red-team prompts.
- Build the harnesses and CI gates that let us iterate on prompts, models, and tools with confidence.
- Uphold evals as the unit of progress - no agent change ships without a measurable signal.
Backend & platform foundations
- Design and build scalable backend services and secure RESTful APIs in Python (FastAPI), with strong data modeling across relational and non-relational stores.
- Enforce authentication/authorization (RBAC), input validation, and robust error handling for agent-facing endpoints.
- Implement caching, queues, and vector storage where the agent workload requires it.
Quality, delivery & collaboration
- Drive performance tuning, code reviews, and technical documentation within your area of the AI platform.
- Maintain CI/CD with Git/GitLab and Docker; ensure reproducible local-dev and deployment pipelines.
- Partner with UI/UX, production, SRE, IT, and game-team stakeholders to translate workflows into agentic solutions.
- Contribute to architectural decisions and share agentic-systems expertise with peers.
- Work within agile methodologies.

Qualifications

Foundation (must-have software-engineering baseline)
- 3+ years of professional experience building production applications, with recent depth in AI/LLM-based systems.
- Strong proficiency in at least one of Python, TypeScript, or JavaScript - Python expertise is required for our stack (FastAPI, Pydantic, SQLAlchemy or equivalent).
- Solid database skills across relational (PostgreSQL) and non-relational systems (e.g. MongoDB, vector databases); familiar with caching/queues (Redis) where applicable.
- Working knowledge of RBAC, authn/authz patterns, and secure API design.
- Comfortable with Git, GitLab CI/CD, and Docker/containers.
- Proven testing mindset and experience with automated test suites (e.g. pytest).
Agent loop / harness engineering
- Demonstrated experience designing and operating agent loops in production - not just prompt-tuning a chatbot.
- Deep, practical understanding of prompting: guardrails, context poisoning/pollution, lost-in-the-middle effects, and the inference parameters that govern model behavior (context window size, temperature, top-k).
- Hands-on experience integrating tools into agents: MCP, structured I/O for context, and sub-agent orchestration.
- Experience with any agent development framework (e.g. LangChain, LangGraph, Claude Agent SDK, Pydantic AI, or comparable).
- Strong instincts for observability and telemetry in non-deterministic systems.
Agent memory
- Practical experience implementing memory for agents: history compaction, context chaining, episodic memory, and short-term vs long-term separation.
- Familiarity with retention/decay strategies and applying RBAC to multi-tenant memory.
Evaluation & quality
- Experience with test- and eval-driven development for LLM systems: building eval sets, regression suites, and CI gates around model/prompt changes.
Communication
- English communication is a MUST - strong written and verbal English is required, and fluency is a significant plus given our globally distributed teams.
- Comfortable communicating technical decisions and tradeoffs across cross-functional stakeholders.
Nice to have
- Experience running local LLMs (e.g. via vLLM, Ollama, llama.cpp) and reasoning about the cost/latency/quality tradeoffs vs hosted models.
- Contributions to or familiarity with open-source agent harnesses (e.g. OpenCode, OpenClaw).
- Experience with agent development frameworks (LangChain/LangGraph/Claude Agent SDK/Pydantic AI) beyond prototype stage.

About Us

Founded in 2004, Virtuos is one of the largest independent video game development companies. We are headquartered in Singapore with offices in Asia, Europe, and North America. Specializing in full-cycle game development and art production, we have delivered high-quality content for more than 1,500 console, PC, and mobile games. Our clients include 23 of the top 25 gaming companies worldwide.

About Our Team

Launched in 2023, Virtuos Labs – Prague specializes in a diverse array of solutions, including networking, UI development, platform solutions and optimization. It is part of the Virtuos Labs network and works alongside sister studios to complement their respective specializations in R&D, proprietary game engine development, and graphic rendering.

Benefits

- Flexible working hours and Home Office
- Full-time Employment (HPP) so you can focus on your work
- Meal allowance (paid with your salary)
- Multisport card
- Hard-skills and soft-skills development
- Work on exciting big game projects (mostly AAA titles)
- Learn new technologies and practices specific to game development
- Your work will be seen and your name will be in the game
- An environment where we share our enthusiasm for gaming
- You will be part of our newly created branch of Virtuos

People matter. Diverse opinions and experiences matter. At Virtuos, our talented teams are the cornerstone of our success, and we recognize that fostering and advocating for inclusivity is at the center of what we do best - we make games better, together. Virtuos is proud to be an equal opportunity employer that embraces diversity of thought, expression, culture, and backgrounds.

Skills Required

- 3+ years professional experience building production applications with recent depth in AI/LLM systems
- Python expertise (FastAPI, Pydantic, SQLAlchemy or equivalent)
- Strong proficiency in TypeScript or JavaScript (one of these plus Python proficiency expected)
- Solid database skills: PostgreSQL and non-relational systems (e.g., MongoDB) and familiarity with vector databases/storage
- Familiarity with caching and queues (e.g., Redis)
- Working knowledge of RBAC, authentication/authorization patterns, and secure API design
- Experience with Git, GitLab CI/CD, and Docker/containers
- Proven testing mindset and experience with automated test suites (e.g., pytest)
- Demonstrated experience designing and operating agent loops in production (tool invocation, sub-agent orchestration, prompt orchestration)
- Deep practical understanding of prompting strategies and LLM hyperparameters (context window, temperature, top-k, mitigation of context poisoning)
- Hands-on experience integrating tools into agents (MCP, structured I/O, sub-agent composition)
- Experience implementing agent memory: history compaction, context chaining, episodic memory, retention/decay policies
- Experience with test- and eval-driven development for LLM systems: eval sets, regression suites, CI gates
- Strong English written and verbal communication
- Experience running local LLMs (vLLM, Ollama, llama.cpp) and reasoning about cost/latency/quality tradeoffs
- Contributions to or familiarity with open-source agent harnesses (e.g., OpenCode, OpenClaw)
- Experience with agent development frameworks beyond prototype stage (LangChain, LangGraph, Claude Agent SDK, Pydantic AI)