Articul8 AI

Staff Applied AI Researcher - Agentic Reasoning Systems (India)

Dublin, CA, USA

In-Office

Senior level

Artificial Intelligence • Software

The Role

Lead research on agentic reasoning systems and runtime intelligence, architect infrastructure for AI research, and unify perception, reasoning, and evaluation methods to improve decision-making for autonomous systems.

Summary Generated by Built In

About us:

Articul8 was born from a simple belief: GenAI should work for the enterprise, not the other way around. Our platform combines domain-specific models, autonomous agentic reasoning through ModelMesh™, reliable model evaluation through LLM-IQ™, and multimodal understanding to serve regulated industries including energy, semiconductor, finance, aerospace, and supply chain. Trusted by Fortune 500 enterprises, we bring together research, engineering, product, and domain expertise to deliver AI that meets the accuracy, explainability, and auditability standards that high-stakes environments demand.

Job Description:

Articul8 AI is seeking a Staff Applied AI Researcher to define how our platform reasons at runtime and how autonomous systems make trustworthy decisions in production. You will lead research across the core runtime intelligence capabilities behind ModelMesh™: task decomposition, agent coordination, model and tool routing, probabilistic decisioning, verification, observability-aware execution, and the evaluation methods that determine whether autonomous behavior is reliable enough for enterprise use.

Responsibilities:

Set technical direction for agentic reasoning systems and runtime intelligence across ModelMesh™ — define the orchestration strategies, decision policies, verification approaches, and runtime quality standards that determine how massively parallel agent systems reason, coordinate, and self-correct in production
Architect the infrastructure for researcher augmentation at scale: design the agentic platforms and orchestration primitives that enable every researcher and engineer at Articul8 to deploy fleets of AI agents for experimentation, evaluation, and production integration, multiplying the depth, breadth, and velocity of the entire organization
Go deep: advance the science of autonomous reasoning — design, train, and refine the learned components behind runtime decisioning (routing models, verification models, confidence estimators, reward models, policy selectors), using massively parallel agent-driven experiment pipelines to explore architectural and algorithmic frontiers exhaustively
Go broad: unify perception, retrieval, reasoning, and action — build repeatable methodology for composing domain-specific models, data perception systems, knowledge graphs, retrieval layers, and external tools into coherent agentic workflows, delegating integration testing and cross-modal benchmarking to parallel agent systems so you can reason across the full stack simultaneously
Drive research on agent reliability for regulated environments — lead failure detection, self-checking, verification workflows, compounding error analysis, and auditable autonomous behavior research, using agent-orchestrated stress testing and red-teaming at scales that manual evaluation cannot reach
Define evaluation methodology for runtime intelligence — establish how task success, decision quality, robustness, traceability, and failure recovery are measured under realistic enterprise conditions, building agentic evaluation harnesses that run continuously and surface regressions before they reach customers
Influence platform-level architecture — shape decisions on model routing, tool use, observability, governance, access control, and interoperability with external agent ecosystems, ensuring the platform is designed for humans and agents to amplify each other
Mentor researchers across levels in the agentic paradigm — raise the bar on technical judgment, experimental rigor, and agent-augmented research practice; contribute to hiring researchers who are driven to maximize their human potential
Maintain hands-on research impact — sustain a meaningful personal research contribution through technical work, publications, patents, and externally visible output, modeling what it looks like to be a deeply technical leader who uses agentic systems to go deeper and faster than ever before

Required Qualifications:

Education: PhD or MSc in Computer Science, Machine Learning, AI, Robotics, or a related field.
Experience: 8+ years in AI/ML research with demonstrated impact on production systems, including 3+ years building LLM-based or autonomous AI systems.
Reasoning and orchestration: Deep hands-on experience in at least two of: multi-agent coordination, planning under uncertainty, sequential decision-making, probabilistic inference, model routing, or tool-using agent systems. You've built systems where multiple models must collaborate to produce a reliable outcome.
Evaluation of autonomous systems: You have designed evaluation frameworks for systems where correctness is not binary — measuring decision quality, reliability under distribution shift, compounding error rates, and failure recovery in production-like conditions.
Systems at scale: You have designed and operated research systems that integrate multiple models, data sources, and control mechanisms in production or near-production settings. You understand the difference between a demo and a system.
Software engineering: Proficient in Python with strong software architecture instincts. Your systems are maintainable, testable, and operable.
Technical leadership: You have set technical direction for a research area, mentored researchers, and influenced quality standards beyond your immediate team.

Preferred Qualifications:

Experience building orchestration systems with non-trivial control flow — dynamic routing, verification loops, probabilistic gating — not just prompt chaining or fixed DAGs.
Background in probabilistic modeling, Bayesian inference, control theory, or formal verification applied to ML systems — you can reason about uncertainty, not just measure it.
Experience with reliability engineering for autonomous AI in regulated environments: observability, safety constraints, graceful degradation, and audit trails.
Track record of integrating heterogeneous components (retrieval, knowledge graphs, domain models, external APIs) into systems that are more reliable than their individual parts.
Strong publication record with evidence of sustained, focused research impact — not just breadth.
Experience taking reasoning or agent systems from prototype to production serving real enterprise customers.
Domain familiarity in energy, semiconductor, finance, aerospace, telecom, or supply chain.

Professional Attributes (Code42):

Practice Humility: You recognize that setting technical direction is a responsibility, not a status. You change your mind publicly when the evidence demands it and build a team culture where the best idea wins regardless of who proposed it.
Bias for Outcomes: You define success by customer and platform impact, not research novelty alone. You make hard prioritization calls and hold yourself accountable for whether the team's work moved the needle.
Care Deeply: You take personal responsibility for the reliability and trustworthiness of the systems your team builds. You invest in the people around you — their growth, their clarity, their ability to do their best work — because that's how real quality is sustained.
Dare to Do the Impossible & Embrace Scarcity: You take on problems that don't have known solutions and structure them into tractable research programs. You don't wait for perfect resources — you build with what you have and make the case for what you need with results, not requests.
Build a Better World: You ensure that the autonomous systems you build are worthy of the trust enterprises place in them. You hold the team to standards of auditability, reliability, and fairness that go beyond what's required — because you believe the bar should be set by builders, not regulators.

Skills Required

PhD or MSc in Computer Science, Machine Learning, AI, Robotics, or a related field
8+ years in AI/ML research with impact on production systems
3+ years building LLM-based or autonomous AI systems
Deep hands-on experience in at least two of: multi-agent coordination, planning under uncertainty, sequential decision-making, probabilistic inference, model routing, or tool-using agent systems
Designed evaluation frameworks for systems where correctness is not binary
Designed and operated research systems in production or near-production settings
Proficient in Python with strong software architecture skills
Set technical direction for a research area and mentored researchers

The Company

HQ: Dublin, California

58 Employees

Year Founded: 2024

What We Do

Articul8 AI is a technology company whose products transform enterprise data and expertise into powerful engines of growth, value, and impact. Our full-stack GenAI platform is revolutionizing how enterprises harness their data and expertise to build expert-level Generative AI applications for their mission-critical challenges. Our products deliver enterprise-scale impact with ROI in hours to weeks. General-purpose GenAI models, while necessary, are not sufficient to deliver enterprise-specific decisioning and actioning. Our platform addresses this gap by making it straightforward for companies to build sophisticated, enterprise-scale, expert-level GenAI applications that encode their domain expertise. Our proprietary technology does the heavy lifting through autonomous decisions and actions, automated data intelligence, and improved precision and relevance, with industry knowledge encoded into Articul8's library of domain- and task-specific models. We are purpose-built for regulated industries and meet the highest standards of compliance, data security, privacy, and performance, including traceability and auditability at every step. We are trusted by leading global enterprises such as AIAA, Itochu Techno-Solutions Corporation, Uptycs, AWS, NIQ, Intel, and Franklin Templeton to transform their mission-critical work. We are the enterprise GenAI platform that simply works! For more information, please visit www.articul8.ai.