Era4 develops, owns, and operates AI infrastructure across the UK, powered by renewable energy. By converting legacy industrial and energy sites into modern data-centre facilities, Era4 combines brownfield regeneration with cleaner, more efficient, and scalable compute capacity for healthcare, research, finance, enterprise, and public-sector organisations.
Role Summary:
We are seeking a Technical Product Manager – AI Cloud Infrastructure to join our fast-scaling team. In this role, you will embed with engineering to act as the "First Customer," owning the continuous validation, reliability strategy, and technical documentation for our bare-metal, VM, Kubernetes, and ML infrastructure. By treating testability as a core feature and shadowing real-world workflows, you will ensure our compute platform handles the demands of advanced AI training and engineering workloads. This is an opportunity to join a mission-led AI business that is redefining infrastructure, intelligence, and impact for enterprise customers.
Key Responsibilities:
- Execute integration testing in staging environments, work closely with the platform engineers to build repeatable test frameworks, and shadow internal and external AI infrastructure engineers to translate their real-world usage patterns into automated in-house test cases.
- Establish strict quality gates, performance SLOs, and scheduling benchmarks that our compute and orchestration services must pass before production deployment.
- Review, refine, and author technical guides, API documentation, and CLI guides, using them as the blueprint to test the platform exactly as an external engineer would.
- Partner with software and platform engineers to design robust validation suites, anticipating complex edge cases and structural failure modes across bare-metal provisioning and Kubernetes cluster lifecycles.
Essential Experience:
- Technical familiarity with bare-metal infrastructure (e.g., PXE booting, IPMI/Redfish), virtualization layers (e.g., KVM), and container orchestration (Kubernetes or similar).
- Track record of designing comprehensive test strategies, validation frameworks, and acceptance criteria for highly technical cloud-native, API, or infrastructure-as-a-service (IaaS) products.
- Ability to analyse infrastructure services, CLIs, and APIs from a developer’s perspective to identify friction points, usability gaps, and reliability risks.
- Working knowledge of modern CI/CD pipelines, automated testing, and automation tooling (e.g., GitLab CI, GitHub Actions, Terraform, Ansible) to help engineering shape automated quality gates.
- Proven experience in a highly technical role embedded directly within a core infrastructure or platform engineering team.
One or more would be an advantage:
- Direct exposure to high-performance computing (HPC) setups, large-scale cluster scheduling (e.g., Slurm), or infrastructure optimized for heavy AI/ML training workloads.
- Experience using cloud observability, telemetry, and monitoring tools (e.g., Prometheus, Grafana, Datadog) to track and improve system reliability metrics.
- Experience writing or structuring technical documentation, API reference guides, and developer tutorials from scratch.
Why Join Era4:
You’ll be joining a mission-driven start-up building critical national infrastructure, where operational excellence directly enables growth. This role offers high visibility with leadership, real autonomy, and the chance to shape how a next-generation company operates at scale.
Diversity & Inclusion:
Era4 is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
Skills Required
- Technical familiarity with bare-metal infrastructure (PXE booting, IPMI/Redfish).
- Experience with virtualization layers (KVM).
- Experience with container orchestration (Kubernetes or similar).
- Track record of designing comprehensive test strategies, validation frameworks, and acceptance criteria for cloud-native or IaaS products.
- Ability to analyse infrastructure services, CLIs, and APIs from a developer perspective to identify usability and reliability gaps.
- Working knowledge of modern CI/CD pipelines and automation tooling (e.g., GitLab CI, GitHub Actions, Terraform, Ansible).
- Proven experience embedded within a core infrastructure or platform engineering team.
- Direct exposure to HPC, large-scale cluster scheduling (e.g., Slurm), or AI/ML-optimized infrastructure.
- Experience using observability and telemetry tools (Prometheus, Grafana, Datadog).
- Experience writing or structuring technical documentation, API reference guides, and developer tutorials.
What We Do
Carbon3.ai is building the UK’s sovereign AI platform: secure, sustainable, and designed for real-world impact. Growing demand for AI is creating new challenges, and compute requirements are outpacing supply. At Carbon3.ai, we’re not just providing infrastructure; we’re building the foundations to overcome these challenges.
We are an energy business transforming into the UK’s sovereign choice for AI. We are vertically integrated from soil to software, transforming legacy industrial sites into renewable-powered AI data hubs. Our infrastructure is designed, owned, and operated by Carbon3.ai, and all infrastructure and data processing are located within the UK and fully subject to UK jurisdiction and regulatory oversight. We generate our own off-grid renewable power, providing low-cost, sustainable energy comparable to Nordic levels, which makes AI workloads both affordable and sustainable. We own 50+ sites across the UK and are rapidly scaling them into AI data centres, enabling high-density, low-latency, sovereign AI deployment at national scale.
Whether you’re training models, deploying intelligent agents, or building industry-specific solutions, Carbon3.ai accelerates your journey from concept to production. Backed by strategic partnerships with leading brands and robust investment, we’re building the infrastructure to power the UK’s most ambitious AI innovation, ensuring British enterprises can access world-class AI capabilities securely and sustainably.








