Credo AI is a venture-backed company on a mission to empower organizations to responsibly build, adopt, procure, and use AI at scale. Credo AI has built a pioneering platform for context-driven AI governance, AI risk assessment, and compliance (with regulations like the EU AI Act and standards like NIST AI RMF and ISO 42001) to ensure compliant, fair, and auditable development and use of AI. Our goal is to move responsible AI development from an “ethical” choice to an obvious one by ensuring AI’s benefits are universally accessible while addressing the full spectrum of its risks. We aim to do this both by making it easier for organizations to integrate responsible AI governance practices into their AI development and by collaborating with regulators and policymakers to set up appropriate ecosystem incentives. Founded in 2020, Credo AI has been recognized as one of the Most Innovative Companies of 2024 by Fast Company and as a Technology Pioneer by the World Economic Forum, named to the CBInsights' AI 100 List and World's Most Promising Startups list, and included in Fast Company’s Next Big Thing in Tech and the Intelligent Applications Top 40 by Madrona, Goldman Sachs, Microsoft, and Pitchbook.
About Us
Credo AI is a leading AI governance platform helping enterprises implement responsible AI at scale. We are building the next generation of AI-powered governance solutions that enable organizations to manage risk, ensure compliance, and scale their AI deployments with confidence.
About the Role
This is a senior technical role building the AI systems that power Credo AI's governance products. You will design and build intelligent systems that observe, reason about, and act on governance knowledge, from the agent infrastructure that monitors AI behavior in enterprise deployments to the knowledge systems that make governance intelligence accessible and actionable at scale.
You are someone who cares about how AI systems behave, not just whether they perform. You think carefully about reliability, consistency, and failure modes — and you have the engineering discipline to turn that thinking into production systems. You work across the full stack of modern AI engineering: LLM-based agents, retrieval and knowledge systems, evaluation pipelines, and the behavioral configuration layers that shape how AI acts within defined constraints.
You will write code, ship systems, and own technical direction — while collaborating closely with AI researchers, governance experts, and product teams.
What You'll Build
AI agent systems
Design and implement agent architectures that reason about governance policies and take action within defined constraints
Build instrumentation that exposes how agents reason and act, making their behavior auditable at the session level
Design and build the telemetry, filtering, and analytics infrastructure that lets governance owners empirically verify how agents are behaving at organizational scale
Develop behavioral configuration and constraint systems that encode organizational policies into how AI agents operate
Architect evaluation frameworks that measure whether agent behavior actually aligns with governance intent
RAG - Governance knowledge systems
Develop retrieval and context systems that surface relevant governance knowledge at the right moment, to the right system or user
Design hybrid retrieval architectures combining semantic search, structured knowledge traversal, and dynamic context assembly
Platform infrastructure
Contribute to the broader AI systems architecture underlying Credo AI's platform
Work with data and product teams to translate governance intelligence into reliable, scalable product features
Establish engineering standards and best practices for AI system development across the team
You have built LLM-based systems in production and you have encountered their failure modes firsthand — inconsistency, instruction-following breakdowns, unexpected behaviors at distribution edges. You think carefully about how to specify, evaluate, and constrain AI behavior, and you bring engineering rigor to problems that sit at the boundary of ML research and systems design.
You read the literature on agent architectures, evaluation methodology, and alignment-adjacent topics not because your job requires it but because you find them genuinely useful. You move fast, ship things, and iterate — but you think carefully about failure modes before they reach production.
Minimum Qualifications
5+ years building production AI/ML systems, with meaningful experience shipping LLM-based applications
Strong experience with agent architectures: tool use, planning, multi-step reasoning, and the failure modes that accompany them
Evaluation mindset — you have designed evals, run them, and used results to make systems meaningfully better
Experience with behavioral shaping techniques: prompt architecture, output validation, policy-grounded constraints
Solid systems engineering: you can design data pipelines, APIs, and distributed systems that hold up in production
Experience building monitoring or observability systems for AI applications in production
Experience with retrieval-augmented generation and the tradeoffs in hybrid retrieval systems
Strong communicator and collaborator
Research background or publications in LLM evaluation, alignment, agent safety, or AI robustness
Experience with red-teaming, adversarial evaluation, or automated failure detection
Background in multi-agent systems or systems where multiple AI components interact
Familiarity with AI governance, compliance, or risk management domains
Track record contributing to open-source ML or evaluation infrastructure
We don't expect any single candidate to meet every qualification listed. If you bring deep experience in some of these areas, sharp engineering judgment, and the curiosity to grow into the rest, we want to hear from you.
Why This Role
The AI systems problems at Credo AI are not solved problems. How do you reliably make AI agents behave within governance boundaries? How do you evaluate whether a governance-reasoning AI system is actually getting it right? You will be working on these questions in a domain (enterprise AI governance) where the right architectural decisions are still being made, with close collaboration between engineering, research, and deep domain expertise.
*Supports EST/CST/PST time zone (your choice)
The expected base salary range for this position is ₹40–50L. Our salary ranges are determined by role, level, and location. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position in the specified location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training.
Location & Remote Culture
While this is a remote role and we're a fully distributed team, we routinely meet up in person. We support team members in coordinating in-person coworking whenever possible, and we organize company-wide offsites multiple times a year.
At Credo AI we value diversity, equity, and inclusion as core principles in our work environment and in the development of our product offerings, and we have implemented initiatives to foster and support these values.
Credo AI Benefits & Perks
Competitive Salary and Equity
Health: We offer health, dental, and vision coverage. We also offer an ergonomic benefit to cover the costs of equipment to help staff stay healthy while working, both in the office and at home.
Coworking: We will cover the cost of co-working spaces like WeWork and in-person meetups.
Unlimited PTO: Credo AI offers unlimited time off to support our employees.
Generous Parental Leave: We offer up to 12 weeks of paid parental leave.
401(k) plan for employees (US only)
Skills Required
- 5+ years building production AI/ML systems with experience shipping LLM-based applications
- Strong experience with agent architectures including tool use, planning, multi-step reasoning, and related failure modes
- Designed and executed evaluations, using results to improve system performance
- Experience with behavioral shaping techniques: prompt architecture, output validation, policy-grounded constraints
- Solid systems engineering skills: designing data pipelines, APIs, and distributed production systems
- Experience with retrieval-augmented generation and hybrid retrieval system tradeoffs
- Strong communication skills to translate technical tradeoffs to non-engineering stakeholders
- Research background or publications in LLM evaluation, alignment, agent safety, or AI robustness
- Experience building monitoring or observability systems for AI applications in production
- Familiarity with knowledge graphs, ontologies, or structured knowledge systems
- Experience with red-teaming, adversarial evaluation, or automated failure detection
- Background in multi-agent systems or interacting AI components
- Familiarity with AI governance, compliance, or risk management domains
- Track record contributing to open-source ML or evaluation infrastructure
What We Do
Credo AI is on a mission to empower enterprises to responsibly build, adopt, procure, and use AI at scale. Credo AI’s cutting-edge AI governance platform automates AI oversight and risk management while enabling regulatory compliance with emerging global standards like the EU AI Act, NIST, and ISO. Credo AI keeps humans in control of AI for better business and society. Founded in 2020, Credo AI has been recognized as a CBInsights AI 100 company, a Technology Pioneer by the World Economic Forum, one of Fast Company’s Next Big Thing in Tech, and a top Intelligent App 40 by Madrona, Goldman Sachs, Microsoft, and Pitchbook. To learn more, visit: credo.ai.







