- Technical Leadership & Architecture
  - Define and own the technical vision for agentic AI systems across the platform
  - Architect scalable multi-agent systems, orchestration frameworks, MCP server infrastructure, retrieval and memory pipelines, and observability layers
  - Drive architectural decisions related to MCP/tool ecosystems, AI platform design, and LLMOps infrastructure
  - Evaluate emerging AI technologies, frameworks, and models to influence engineering and product roadmaps
  - Create and maintain Architecture Decision Records (ADRs) and technical standards
- Engineering & Delivery
  - Design and develop critical AI platform components and infrastructure
  - Establish AI engineering best practices and discipline across the organization - design patterns, evaluation practices, prompt engineering, reliability standards, governance, and cost optimization
  - Lead cross-functional technical initiatives to improve AI system quality, reliability, and scalability
  - Collaborate with platform, infrastructure, and data engineering teams to embed AI-driven automation into cloud operations workflows
- Mentorship & Technical Community
  - Mentor Lead and Staff AI Engineers through architecture reviews, design discussions, and problem-solving sessions
  - Conduct rigorous technical reviews of designs, architectures, and major code contributions
  - Contribute to MontyCloud’s technical brand through technical writing, open-source contributions, or speaking engagements
- Innovation & Strategic Impact
  - Identify opportunities where agentic AI can create significant product or operational improvements
  - Build prototypes, technical proposals, and proof-of-concepts to validate new ideas
  - Stay current with advancements in AI research, agentic frameworks, and LLMOps practices
- Agentic AI & Multi-Agent Systems
  - Production-grade agentic AI system design and development
  - Agentic AI System Design & Architecture - Multi-agent architectures and orchestration, agent-to-agent communication, agent memory and planning strategies, tool integration and MCP server design
  - Agent orchestration frameworks - LangGraph, Strands Agents, CrewAI, AutoGen, or equivalent agentic AI frameworks
- LLMOps & AI Platform Engineering
  - AI Governance & Lifecycle Management - Prompt versioning and governance, evaluation frameworks, regression detection
  - AI Observability & Monitoring - Output quality monitoring, agent tracing and observability
  - AI Cost Management - Cost governance for high-scale AI workloads
- Cloud & Infrastructure
  - Cloud AI Platforms & Services - AWS cloud ecosystem, AWS Bedrock, AgentCore
  - Cloud-Native Infrastructure & Deployment - Cloud-native AI deployments, Kubernetes, Docker
  - Infrastructure as Code (IaC) - Terraform
- Foundation Models & AI Integrations
  - Foundation model API integration - OpenAI, Anthropic, AWS Bedrock, Azure OpenAI, Hugging Face
  - MCP and AI tool integration architecture
- RAG & Knowledge Systems
  - Retrieval-Augmented Generation (RAG) and Graph-RAG architectures
  - Embedding strategies, retrieval and reranking systems
  - Knowledge graph integrations
- Technical Leadership and Communication
  - Cross-team technical influence
  - Technical communication and documentation
  - Organization-level engineering ownership
  - Proactive problem identification and resolution
- Domain Experience
  - AI systems for cloud operations and infrastructure automation
  - Developer tooling platforms
- AI Deployment & Optimization
  - Serverless AI deployment patterns
  - AI inference cost optimization
- Advanced AI Techniques Exposure
  - Model fine-tuning and RLHF
  - Advanced model evaluation techniques
- Industry & Community Exposure
  - AI-first or cloud-native product company
  - Open-source contributions, technical blogs, conference talks, or published research in AI/agentic systems
- 12+ years of overall software engineering experience
- Prior experience in a Principal Engineer role or equivalent individual contributor (IC) role
- Significant recent hands-on experience building and deploying applied AI systems in production environments
- Proven track record of leading large-scale technical initiatives across multiple teams or product areas
- Demonstrated expertise in architecting enterprise-scale AI platforms and cloud-native AI workloads
- Experience mentoring senior engineers and influencing technical strategy at an organizational level
- Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, Machine Learning, Engineering, or a related technical discipline
- Equivalent practical experience in advanced AI system design and distributed cloud platforms may also be considered
Skills Required
- 12+ years of software engineering experience
- Prior experience in a Principal Engineer role
- Hands-on experience building and deploying applied AI systems
- Expertise in architecting enterprise-scale AI platforms
- Experience mentoring senior engineers
- Bachelor's or Master's degree in a related field
MontyCloud Compensation & Benefits Highlights
The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about MontyCloud and has not been reviewed or approved by MontyCloud.
- Fair & Transparent Compensation: Feedback suggests compensation and benefits are viewed favorably overall, indicating competitive pay positioning for many roles.
- Healthcare Strength: Job postings indicate medical, dental, and vision coverage as part of a comprehensive package in the U.S.
- Equity Value & Accessibility: Listings highlight equity participation as a standard component, signaling accessible ownership opportunities for employees.
MontyCloud Insights
What We Do
MontyCloud is a Seattle, WA-based intelligent Cloud Management Platform company. Our customers use MontyCloud DAY2™ to instantly close the cloud skills gap, simplify CloudOps, and reduce the total cost of cloud operations by up to 70%, all in just a few clicks. By leveraging the AWS public cloud, AI, and ML, DAY2™ simplifies provisioning, security, compliance, cost optimization, and routine operations. DAY2™’s automation-first, no-code approach helps customers immediately derive deep insights and deliver intelligent cloud operations in just a few minutes. You can try the platform for free at https://MontyCloud.com