Senior Platform Engineer - Product Reliability

Posted 12 Hours Ago
Melbourne, Victoria, AUS
Hybrid
Senior level
Software • Energy
The Role
As a Senior Platform Engineer in Product Reliability, you will enhance the performance and scalability of products, mentor teams on best practices, and drive reliability improvements in a cloud environment using technologies like AWS and Kubernetes.
Summary Generated by Built In
Help us use technology to make a big green dent in the universe!
 
Kraken powers some of the most innovative global developments in energy.
 
We’re a technology company focused on creating a smart, sustainable energy system. From optimising renewable generation, creating a more intelligent grid and enabling utilities to provide excellent customer experiences, our operating system for energy is transforming the industry around the world in a way that benefits everyone.
 
It’s a really exciting time in energy. Help us make a real impact on shaping a better, more sustainable future. 
 
Kraken Customer
 
What we do: build the most AI-driven, innovative, forward-thinking platform for energy management. From optimising resources to delivering cost-effective, exceptional customer experiences through advanced Customer Information Systems (CIS), billing, meter data management, CRM, and AI-driven communications, Kraken is powering the next wave of innovation in the energy industry.
 
Why we do it: future energy will not look like energy as we know it today. We need to not just think about our future, but build for it. Now.

The Team
 
We have expanded our tentacles and are looking for someone (based in Melbourne, Australia or remote within Australia) to join our Global Platform Engineering Reliability - Product Reliability team.
 
Our Reliability group is responsible for architecting, developing, and maintaining the resilient and scalable infrastructure that powers and supports our platform.
 
As a Senior Product Reliability Engineer within the newly created ‘Product Reliability’ team, you'll be responsible for ensuring the availability, performance and scalability of the products on our platform.
 
Your proficiency in supporting products that serve millions of customers will ensure stability and high performance for our brands and clients.
 
You’ll keep up with best practices in building products for scale. Your communication skills and attention to detail will be indispensable as you pinpoint areas for enhancement, ensure optimal product performance and continuously improve our reliability and efficiency.

What you'll do:

  • Teach and support product teams on best practices for reliability, implementation patterns and effective usage of our existing platforms
  • Support product teams in improving the performance and availability of their systems
  • Be hands-on in code and infrastructure to help product teams with reliability improvements
  • Provide comprehensive feedback to the wider Platform group on improvements to be made to core infrastructure based on observations and first-hand experience in the code base
  • Support product teams in building proofs of concept as needed, evolving application deployment architecture to keep pace with business growth and to improve scalability and system resilience
  • Collaborate with product teams to support the release of new features and services, ensuring adherence to reliability and performance standards
  • Guide product teams in designing systems for resilience and graceful failure under heavy load
  • Assist application teams with post-incident tasks and follow-ups, and contribute to the creation and review of post-mortem documentation
  • Analyse incident metrics to identify trends and potential improvements, communicating these insights to the product teams
  • Help solve interesting and difficult problems. There’s a great opportunity for disruption in the global energy market

What you'll have:

  • Great communication skills, working effectively with developers, product managers and other business stakeholders to understand, design and deliver impactful projects and reliability improvements
  • Solid hands-on experience across our core platform stack:
      • AWS (supporting and improving cloud infrastructure used by product teams)
      • Terraform (infrastructure as code; comfortable operating with Terraform day-to-day)
      • Kubernetes (container orchestration and deployment management; comfortable working with Kubernetes day-to-day)
  • Experience using industry-standard observability tooling - we use Datadog, Grafana, Prometheus and Rootly (experience with other monitoring/alerting platforms is transferable)
  • Strong collaboration and communication skills - able to work effectively with developers, product managers, and other stakeholders to design and deliver impactful observability “golden paths” and monitoring experiences
  • Exposure to Python (or a comparable language such as TypeScript, Go or C#) - able to understand how applications behave in production to support observability and reliability improvements
  • Previous experience working in small, highly autonomous teams
  • A working style that fits how we operate:
      • Comfortable with ambiguity and able to create structure in unclear situations
      • Proactive learning mindset (experiment, iterate, and adapt as the team evolves approaches)
      • Strong asynchronous written communication (Slack/Notion/docs) and a habit of keeping others in the loop
      • Autonomy and accountability - making progress independently and owning outcomes

What will help:

  • Previous experience as a Site Reliability Engineer
  • Experience working on SaaS platforms, including engaging product teams to ensure up-skilling and knowledge sharing across teams
  • Experience managing and supporting a large-scale, internet-facing service
  • Experience in responding to incidents and outages, writing technical incident reports and organising incident retrospectives
  • Experience working with very large relational databases
  • Experience in using service level objectives to improve application performance
  • A proactive, innovative mindset

Kraken is a certified Great Place to Work in France, Germany, Spain, Japan, Australia, and the USA. In the UK we are one of the Best Workplaces on Glassdoor with a score of 4.5. Check out our Welcome to the Jungle site (FR/EN) to learn more about our teams and culture.
 
Are you ready for a career with us? We want to ensure you have all the tools and environment you need to unleash your potential. If you need any specific accommodations or have a unique preference, please contact us at [email protected] and we'll do what we can to customise your interview process for comfort and maximum magic!
 
Studies have shown that some groups of people, like women, are less likely to apply to a role unless they meet 100% of the job requirements. Whoever you are, if you like one of our jobs, we encourage you to apply as you might just be the candidate we hire. Across Kraken, we're looking for genuinely decent people who are honest and empathetic. Our people are our strongest asset and the unique skills and perspectives people bring to the team are the driving force of our success. As an equal opportunity employer, we do not discriminate on the basis of any protected attribute. We consider all applicants without regard to race, colour, religion, national origin, age, sex, gender identity or expression, sexual orientation, marital or veteran status, disability, or any other legally protected status. 
 
Our (i) Applicant and Candidate Privacy Notice and Artificial Intelligence (AI) Notice, (ii) Website Privacy Notice and (iii) Cookie Notice govern the collection and use of your personal data in connection with your application and use of our website. These policies explain how we handle your data and outline your rights under applicable laws, including, but not limited to, the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). Depending on your location, you may have the right to access, correct, or delete your information, object to processing, or withdraw consent. By applying, you acknowledge that you’ve read, understood and consent to these terms.
Please note that, in line with our current recruitment policy, we are unable to offer visa sponsorship for this position; applicants must have the right to work in the country that they're applying to, at the time of application.

Skills Required

  • Solid hands-on experience across AWS
  • Hands-on experience with Terraform
  • Experience in Kubernetes for container orchestration
  • Familiarity with observability tools like Datadog and Grafana
  • Exposure to Python or a comparable language
  • Experience working in small autonomous teams
  • Previous experience as a Site Reliability Engineer
  • Experience with SaaS platforms
  • Experience managing large-scale internet-facing services
  • Experience responding to incidents and outages
  • Experience working with large relational databases

The Company
HQ: London
1,206 Employees

What We Do

Kraken delivers transformational tech to utilities around the world to make the global transition to green energy quicker and more affordable. Part of the Octopus Energy Group.
