Platform Engineer (Cloud SRE Ops)

Singapore, SGP
In-Office
Senior level
Information Technology • Security • Consulting • Cybersecurity
The Role
The Platform Engineer will build and operate automated monitoring systems, lead incident management, and ensure high availability of government services while managing projects and collaborating with a cross-functional team.

In Digital Resiliency Engineering (DRE), we combine software and systems engineering to build and operate large-scale distributed systems designed and/or built by the Singapore Government. We ensure Government services are reliable, meet expected performance and satisfy customer needs.

If you have a strong DevOps, infrastructure engineering and/or SRE background, have experience operating mission-critical production technology infrastructure at scale, and are looking for opportunities to work with a team of practitioners and leading industry experts, we welcome you to join us.


In this role, you will build central services for observability and automation of infrastructure services. You will be part of a rotation with other engineers providing rapid response to major incidents impacting critical Government services. You will provide technical leadership for the team and work closely with technical leads to operate highly available solutions. You will also guide other team members on managing the availability and performance of mission-critical services, building automation and monitoring solutions to prevent problem recurrence, and building automated responses for non-exceptional service conditions.


You will also manage execution of project priorities, deadlines and deliverables, and lead the design of major components, systems and features to improve the availability, scalability, latency and efficiency of services designed and built by the Government.

Key Responsibilities:

  • Build Service Level Indicators (SLIs), Service Level Objectives (SLOs), error budgets, and post-mortem incident processes.
  • As part of an on-call roster, ensure the reliability and performance of critical Government services. Provide operational support and engineering for large-scale distributed systems to drive incident resolution effectively.
  • Gather and analyse metrics and logs from Operating Systems and/or applications for capacity planning, performance tuning and fault isolation.
  • Build automation to manage services, infrastructure, and/or applications.
  • Improve reliability and quality of services using proactive monitoring.
  • Measure and optimize system performance, driving continuous improvement and pushing SRE practice forward.
  • Build an SRE playbook for the Whole-of-Government to use as a reference for SRE.
  • Identify potential and emerging technologies relevant to innovation for the Government.
  • Work in a cross-functional service team consisting of software engineers, infrastructure engineers, DevOps, and other specialists.


Requirements
  • 6+ years of experience in technology operations as an Infrastructure Engineer or Site Reliability Engineer - with experience operating large-scale mission critical production systems.
  • Expertise in building and operating automated monitoring and incident detection systems, creating runbooks and running incident management processes.
  • Expertise in designing automation solutions using provisioning tools, continuous integration tools (CI/CD), and scripting languages.
  • Experience leading highly complex technical projects with multiple dependencies and stakeholders.
  • Knowledgeable and experienced in working within an Agile development environment, focusing on dynamic and rapid quality delivery.
  • Proficient in building and managing highly available and scalable IT infrastructure and/or applications, with knowledge of container and virtualization technologies.
  • Proficiency in Python, PowerShell, or Ruby.
  • Proficiency with Infrastructure as Code (IaC) tools such as SaltStack, Puppet, Terraform, or Ansible.
  • Able to work independently and deliver results within specified deadlines.
  • Ability to prioritize work and strong problem-solving skills.
  • Good communication skills, both verbal and written, when dealing with users, vendors and management.
  • Ability to communicate complex interaction concepts clearly and persuasively to different audiences at various levels in GovTech.

Join us and discover a meaningful and exciting career with Assurity Trusted Solutions!


The remuneration package will be commensurate with your qualifications and experience. Interested applicants, please click "Apply Now".


We thank you for your interest and please note that only shortlisted candidates will be notified.


By submitting your application, you agree that your personal data may be collected, used and disclosed by Assurity Trusted Solutions Pte. Ltd. (ATS), GovTech and their service providers and agents in accordance with ATS’s privacy statement which can be found at: https://www.assurity.sg/privacy.html or such other successor site.


Benefits
  • A wholly-owned subsidiary of GovTech.
  • We promote a learning culture and encourage you to grow and learn.
  • Annual Leave Benefits with additional perks such as Family Care and Birthday Leave.
  • Contract staff enjoy the same benefits as permanent employees.


The Company
241 Employees
Year Founded: 2010

What We Do

Assurity Trusted Solutions (ATS) is a wholly owned subsidiary of the Government Technology Agency (GovTech). As a trusted partner, ATS offers a comprehensive suite of products and services ranging from infrastructure and operational services, governance and assurance services, as well as managed processes.
