We are looking for you if you:
- have experience in operating system administration (Linux or Windows),
- know key cloud concepts and can describe what cloud-native means,
- understand and have knowledge of other stack layers – Network, Virtualization, Middleware, Databases,
- have a good understanding of programming (preferred languages: Python, PowerShell, Golang),
- know how to use IaC/orchestration/automation tooling like Azure Pipelines, Ansible, Terraform,
- can identify and automate infrastructure management tasks using infrastructure-as-code best practices,
- know key reliability engineering practices and the idea of consumer engineering, and for you acronyms like SLI, MTTR and BCM are not just a couple of random letters glued together.
English level - B2.
You'll get extra points for:
- valuing your time and not logging in to hosts to run commands: Infrastructure as Code is your creed,
- preferring to prevent incidents rather than solve them,
- always staying a step ahead and using new technologies,
- energy and efficiency,
- being a problem solver, not a spotter,
- being a team player,
- working with minimum supervision.
Your responsibilities:
As the Site Reliability Engineering Department, we focus on four key topics:
- Run & Change,
- Enablement,
- Rapid Response,
- Education.
In your role you will mainly focus on:
- Implementing reliability across global platforms and services, global supporting tooling and entities:
  - Operating in close cooperation with the Enterprise Architects, other SREs and DevOps engineers involved,
  - Implementing observability measures for our critical business services using the respective tooling,
  - Identifying service level objectives with associated indicators,
  - Finding and eliminating manual and repetitive tasks (commonly known as toil),
  - Planning and evaluating new releases of features within the infrastructure environment (release trains).
- Later on, the focus will also be on other practices, e.g.:
  - Mature major incident management process (major incident management, problem management, post-mortems and root-cause analysis),
  - Mature capacity planning and forecasting practice,
  - Mature reliability reporting,
  - Introduction of error budgeting,
  - Knowledge management: spreading the “reliability by design” concept and the execution of all required reliability practices.
Information about the squad:
We are a team of infra admins who got tired of manual work and decided to move to an Infrastructure as Code approach. We want to prevent, not repair, and make our system reliable. Taking the best approaches from Google and Microsoft, we want to create an SRE culture with a focus on Design, Run, Enable, Rapid Response, Educate and Review. Are you up for the challenge?
The role naming convention in the global ING job architecture will be “Engineer IV”.
What We Do
ING is a pioneer in digital banking and at the forefront as one of the most innovative banks in the world. As ING, we have a clear purpose that represents our conviction of people’s potential. We don’t judge, coach, or tell people how to live their lives. However big or small, modest or grand, we empower people and businesses to realise their vision for a better future. We made the promise to make banking frictionless, removing barriers to progress, and to make people confident in their financial decisions. As a global bank we have a huge opportunity – and responsibility – to make an impact for the better. We can play a role by financing change, sharing knowledge, and innovating. Being sustainable is in all the choices we make: as a lender, as a partner and through the services we offer our customers.