AI Engineer, Dev Ops

Posted 6 Days Ago
Singapore, SGP
In-Office
Mid level
Artificial Intelligence • Healthtech • Information Technology • Biotech
The Role
Operate and maintain the Data Platform across GCP and AWS, ensuring reliability, security, cost and performance. Manage IaC and CI/CD, run incident response and post-mortems, operate central data management (pipelines, storage, catalogues, access controls), improve observability and automation, and apply AI tools to streamline ops and tooling.

AI Singapore (AISG) is a national AI programme launched by the National Research Foundation (NRF), Singapore, to build and anchor deep national capabilities in AI. AISG is supported through a government-wide partnership including the NRF, Ministry of Digital Development and Information (MDDI), Infocomm Media Development Authority (IMDA), Economic Development Board (EDB) and Enterprise Singapore (ESG). We bring together research institutions and the vibrant ecosystem of AI start-ups and companies to support impactful research, develop talent, and power Singapore's AI efforts.

We're looking for an AI Engineer, Dev Ops to join the Data Platform team within AI Products at AISG. The role is to keep the Data Platform, and the central data management infrastructure that sits beneath it, running reliably, securely, and efficiently. You'll own the day-to-day operations of cloud infrastructure across GCP and AWS, the CI/CD and observability stack that the engineering team relies on, and the data pipelines, storage layers, and access controls that move data through the platform. You'll also bring strong AI fluency to the role: we expect you to know how to use AI tools deliberately for ops work such as incident triage, runbook generation, log analysis, IaC drafting, and automation, and to help the team raise its operational bar by codifying those practices.

This position will be hosted at Nanyang Technological University (NTU) under the office of the VP (Artificial Intelligence & Digital Economy), and we welcome you to join our community.

Duties & Responsibilities:

Platform operations and reliability

  • Own day-to-day operations of the Data Platform across GCP and AWS, including environment health, capacity, performance, cost, and security posture; define and uphold SLOs, alerting policies, and on-call practices.

  • Lead incident response: triage, mitigate, run blameless post-mortems, and drive preventive actions through to completion.

Central data management and data engineering ops

  • Operate the central data management infrastructure, including data pipelines, storage layers, catalogues, access control, and lineage tooling, and partner with the engineering team on reliability, throughput, cost, and data quality.

  • Implement and maintain backup, retention, disaster recovery, and data residency controls aligned to programme and funder requirements.

Infrastructure, CI/CD, and automation

  • Manage infrastructure-as-code (e.g. Terraform) across GCP and AWS, and maintain CI/CD pipelines, container build/registry workflows, and deployment automation so teams can ship safely and frequently.

  • Strengthen observability across the stack using logs, metrics, traces, and dashboards, and reduce toil by automating repetitive operational tasks.

AI-assisted ops and continuous improvement

  • Use AI tools (e.g. Claude, Copilot, Cursor) deliberately in your ops workflow for incident triage, log and metric analysis, runbook drafting, IaC generation and review, and on-call support, and codify good patterns for the team.

  • Build lightweight internal tooling that leverages AI to reduce operational load (e.g. summarising incidents, suggesting remediations, generating change reports), and contribute to security, compliance, and access reviews (secrets management, IAM hygiene, vulnerability patching, audit-readiness).

Requirements:

You should be a hands-on Dev Ops engineer who is comfortable operating cloud infrastructure end-to-end, who understands data engineering well enough to keep central data systems healthy, and who actively uses AI tools to make ops work faster and more reliable.

The ideal candidate would have:

  • A degree in Computer Science, Information Technology, or equivalent.

  • 3–5 years of Dev Ops, SRE, or platform engineering experience, with a track record of operating production systems at non-trivial scale.

  • Hands-on experience operating workloads on both GCP and AWS, including IaC (e.g. Terraform), containers and orchestration (e.g. Docker, Kubernetes), and managed services for compute, storage, and networking.

  • Strong cybersecurity knowledge and hands-on practice, including IAM and least-privilege design, secrets management, network security, vulnerability and patch management, audit logging, and security incident response. Familiarity with relevant standards and frameworks (e.g. ISO 27001, NIST, OWASP) and with data protection requirements (e.g. Singapore PDPA, GDPR where applicable) is expected.

  • Working knowledge of data engineering and central data management, including pipelines, storage formats, catalogues, access controls, and data quality/observability concepts.

  • Strong fundamentals in CI/CD, observability (logs/metrics/traces), and incident response.

  • Demonstrated use of AI tools (e.g. Claude, Copilot, Cursor) in your day-to-day engineering for code generation, review, debugging, and documentation, with a clear sense of where they help and where they don't.

  • Solid scripting/programming skills (e.g. Python, Bash, Go) and comfort reading other people's code across the stack.

  • Strong communication skills and the ability to work with both technical and non-technical stakeholders across cultures and time zones.

  • Working knowledge of AI/ML evaluation, benchmarking, and/or data annotation workflows, and the tooling ecosystem around them (e.g. evaluation harnesses, annotation platforms).

  • Bonus: experience with multimodal data systems (text, audio, image, video); contributions to open-source AI/data tooling; experience operating within Singapore public sector or research programmes.

We regret that only shortlisted candidates will be notified.

Hiring Institution: NTU

Skills Required

  • Degree in Computer Science, Information Technology, or equivalent
  • 3-5 years DevOps, SRE, or platform engineering experience operating production systems
  • Hands-on experience operating workloads on both GCP and AWS
  • Experience with infrastructure-as-code (e.g., Terraform)
  • Experience with containers and orchestration (Docker, Kubernetes)
  • Strong cybersecurity knowledge and hands-on practice (IAM, least-privilege, secrets management, vulnerability/patch management, audit logging, security incident response)
  • Familiarity with security/privacy standards and frameworks (ISO 27001, NIST, OWASP, PDPA, GDPR)
  • Working knowledge of data engineering and central data management (pipelines, storage formats, catalogues, access controls, data quality/observability)
  • Strong fundamentals in CI/CD, observability (logs/metrics/traces), and incident response
  • Demonstrated use of AI tools (e.g., Claude, Copilot, Cursor) for code generation, review, debugging, documentation and ops workflows
  • Solid scripting/programming skills (Python, Bash, Go) and ability to read code across the stack
  • Strong communication skills and ability to work with technical and non-technical stakeholders across cultures and time zones
  • Working knowledge of AI/ML evaluation, benchmarking, and data annotation workflows and tooling
  • Experience with multimodal data systems (text, audio, image, video)
  • Contributions to open-source AI/data tooling
  • Experience operating within Singapore public sector or research programmes

The Company
HQ: Singapore
10 Employees
Year Founded: 2020

What We Do

The Lee Kong Chian School of Medicine (LKCMedicine) trains doctors with a focus on patient-centered care, integrating precision medicine, Artificial Intelligence (AI) in healthcare, and medical humanities into its undergraduate medical degree program.
