Signal Engineer

Reposted 23 Days Ago
Melbourne, Victoria, AUS
In-Office
Entry level
Artificial Intelligence • Information Technology • Software
The Role
The Signal Engineer will develop data pipelines that clean and process large datasets for training a language model, ensuring high-quality data through editorial judgment and engineering expertise.
About the role

Matilda is Australia's LLM. What ends up in the corpus is what the model learns, so the quality of the data sets the ceiling on the quality of the model.

We're hiring a Signal Engineer to own that ceiling. You will build the pipelines that turn massive, messy, raw data into the dataset Matilda trains on. The work is part engineering, part editorial judgment, done in code.

A lot of the real gains in frontier models come from the data, yet the field underinvests in most of that work. It is one of the highest-leverage places you can spend your time as an engineer.


What you'll work on

- Pipelines that ingest, clean, dedupe, filter, and score training data at TB to PB scale

- Quality classifiers and heuristics that separate useful data from the rest

- Dataset mixture design and experiments on what actually improves the model

- Tools to explore, sample, and audit what's in the corpus

- Close work with researchers and training engineers so data choices connect to model behaviour

What we're looking for

- Strong engineer. Python, data tooling, distributed processing, clean pipelines.

- High attention to detail. Small errors compound fast at this scale.

- Taste and judgment about what good training data looks like.

- Comfort working with very large, very messy datasets.

- Curiosity about how data shapes model behaviour.

- High learning velocity. You don't need a PhD or prior LLM experience.

Nice to have

- Experience with web-scale corpora or pretraining data pipelines

- Experience working with unstructured text data

- Familiarity with distributed data frameworks (Spark, Ray, or similar)

- Exposure to deduplication, quality classification, or tokenisation

Note

Full-time role based in Melbourne, working closely with our in-person team. At this time we are not able to offer visa sponsorship, so applicants must have existing and unrestricted work rights in Australia.

Skills Required

  • Strong engineering skills in Python and data tooling
  • Experience with distributed data processing
  • Ability to work with large, messy datasets

The Company
HQ: Melbourne, VIC
13 Employees

What We Do

We create intelligent systems that understand context, anticipate needs, and turn ideas into action, unlocking entirely new ways for people to work and create. The future isn’t just about software that stores information. It’s about technology that thinks, adapts, and acts. We are pioneering the next generation of AI-powered, action-driven systems that amplify human capability, accelerate workflows, and make work feel effortless. We believe AI should do more than assist; it should empower. If you're passionate about building the next era of intelligent software, join us.
