- Must have hands-on experience managing production-grade workloads across 150+ AWS accounts with $250K+/month cloud spend.
- Must have performed 8+ customer assessments such as Formal Technical Reviews (FTRs) or Well-Architected Framework Reviews (WAFRs) with documented optimization outcomes.
- Must have designed and built reusable automation frameworks for cloud operations (cost, security, compliance, governance) that were adopted across multiple teams or customers.
- Must have contributed to internal tooling or platform capabilities beyond one-off solution delivery.
- Must demonstrate a pattern of identifying repetitive operational problems and automating them into reusable solutions.
- Must have hands-on experience designing and implementing AI-powered solutions using generative AI services (Amazon Bedrock, Bedrock Agents, or equivalent), including agentic workflows and AI orchestration patterns.
- Own and evolve cloud operations capability domains, driving both platform capabilities and internal best practices across areas such as cost optimization, security automation, governance, compliance, and AI-driven operations.
- Define how your capability domain works end-to-end: what gets automated, how it scales, and what the platform delivers to partners and customers.
- Design and build reusable automation frameworks on top of cloud provider services that reduce undifferentiated operational work across large multi-account environments.
- Own delivery of technically complex capabilities spanning multiple cloud services, requiring deep expertise in security, cost, compliance, or governance.
- Automate operations using Python, CloudFormation, Systems Manager, and infrastructure-as-code tooling for proactive scaling, cost-triggered remediations, and security auto-remediation.
- Implement measurable cost optimizations across large cloud footprints through rightsizing, commitment coverage analysis, idle resource elimination, and anomaly detection.
- Define and enforce security baselines, encryption standards, and least-privilege access patterns through automated audits and guardrails.
- Design and build AI-powered cloud operations capabilities using Amazon Bedrock, including agentic workflows with Bedrock Agents, custom orchestrators for complex task automation, and Model Context Protocol (MCP) integrations.
- Define how AI and generative AI capabilities are applied within your capability domains to automate decision-making, anomaly detection, remediation, and operational intelligence.
- Build and validate agentic AI patterns that can operate across multi-account cloud environments at scale.
- Evaluate new foundation models, AI services, and orchestration frameworks for applicability to cloud operations automation.
- Evaluate emerging cloud provider services, build advanced proof-of-concept implementations, and determine which capabilities should be productized.
- Own the lifecycle from identifying cloud operations problems through building validated solutions to translating them into product-ready specifications for the engineering team.
- Set the technical bar for the team: define best practices, review architectural decisions, and ensure consistency across deliverables.
- Work directly with Product Management to deliver feature specifications grounded in real cloud environment behavior.
- Collaborate with Platform Engineering to translate validated R&D patterns into production-grade implementations.
- Partner with Site Reliability Engineering on infrastructure governance, incident response, and cloud spend optimization.
- Contribute to thought leadership through technical blogs and documentation.
- You identify repetitive cloud operations problems and automate them into reusable capabilities that reach thousands of cloud accounts.
- You design AI-powered CloudOps solutions and agentic workflows that transform how the platform delivers value to partners.
- You translate deep cloud expertise into platform features, not one-off deliverables.
- You proactively explore new cloud provider services and AI capabilities, turning them into practical, validated approaches before the rest of the organization asks for them.
- Other team members and the platform itself depend on the frameworks and patterns you build.
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 10+ years of experience in cloud operations and engineering, with deep focus on AWS services.
- Proven track record of building reusable automation and tooling beyond individual engagement delivery.
- Strong proficiency in AWS services such as EC2, S3, RDS, Lambda, Organizations, Control Tower, Security Hub, Config, Cost Explorer, and Bedrock.
- Expert-level scripting in Python or Bash. Experience with CloudFormation, Terraform, or CDK.
- Hands-on experience with Amazon Bedrock, Bedrock Agents, agentic AI workflows, and generative AI application design.
- Deep understanding of multi-account cloud architecture, IAM design, and governance at scale.
- Strong analytical and problem-solving skills with the ability to work independently on ambiguous, complex problems.
- Excellent written and verbal communication skills.
- AWS Certified Solutions Architect (Professional) required. AWS Certified AI Practitioner or Machine Learning Specialty preferred. Additional specialty certifications (Security, FinOps, Networking) preferred.
- Experience with FinOps platforms such as CloudHealth, Apptio Cloudability, or equivalent.
- Experience with Model Context Protocol (MCP) servers, AI orchestration frameworks, and custom agent design.
- Familiarity with DevOps practices, CI/CD pipelines, and chaos engineering.
- Experience contributing to platform tooling or open-source cloud operations projects.
- Knowledge of compliance frameworks: SOC 2, HIPAA, PCI-DSS, CIS Benchmarks.
- Own capability domains whose impact reaches a global MSP partner network and thousands of cloud accounts.
- Design and build AI-powered CloudOps solutions and agentic workflows that shape how the industry approaches cloud operations.
- Build reusable frameworks and automation that other engineers and the platform depend on.
- Work directly with Product, Engineering, and SRE teams on platform capabilities.
- Operate as a technical authority in a team where depth and automation are valued over hierarchy and people management.
- Enjoy a flexible, hybrid work culture that supports work-life balance.
Skills Required
- 10+ years of experience in cloud operations and engineering
- Bachelor's degree in Computer Science, Information Technology, or a related field
- AWS Certified Solutions Architect (Professional)
- Expert-level scripting in Python or Bash
- Hands-on experience with Amazon Bedrock and agentic AI workflows
- Strong proficiency in AWS services such as EC2, S3, RDS, Lambda
- Must have performed customer assessments such as Formal Technical Reviews (FTRs) with documented optimization outcomes
MontyCloud Compensation & Benefits Highlights
The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about MontyCloud and has not been reviewed or approved by MontyCloud.
- Fair & Transparent Compensation: Feedback suggests compensation and benefits are viewed favorably overall, indicating competitive pay positioning for many roles.
- Healthcare Strength: Job postings indicate medical, dental, and vision coverage as part of a comprehensive package in the U.S.
- Equity Value & Accessibility: Listings highlight equity participation as a standard component, signaling accessible ownership opportunities for employees.
MontyCloud Insights
What We Do
MontyCloud is a Seattle, WA-based intelligent Cloud Management Platform company. Our customers use MontyCloud DAY2™ to instantly close the cloud skills gap, simplify CloudOps, and reduce the total cost of cloud operations by up to 70%, all in just a few clicks. By leveraging the AWS public cloud, AI, and ML, DAY2™ simplifies provisioning, security, compliance, cost optimization, and routine operations. DAY2™'s automation-first, no-code approach helps customers immediately derive deep insights and deliver intelligent cloud operations in just a few minutes. You can try the platform for free at https://MontyCloud.com







