Software Engineer, Sensor Data Integration

Posted 7 Days Ago
San Francisco, CA, USA
In-Office
Mid level
Computer Vision • Machine Learning • Software
The Role
Build and maintain scalable pipelines to ingest, standardize, store, and serve large geospatial datasets (point clouds, imagery) for ML and product use. Implement CI/CD, automated checks, performance optimizations, and agentic harnesses for dataset triage and patching. Collaborate with ML and product teams and with customers to integrate and troubleshoot data.
Summary Generated by Built In

At Mach9, Sensor Data Integration Engineers build the algorithms and pipelines that transform large-scale geospatial datasets into structured, accessible formats to power our survey product, Digital Surveyor. You’ll work with high-volume data sources — LiDAR-collected point clouds, on-road imagery, overhead aerial ortho photos — and own the systems that ingest, standardize and store them for our training and product use. Every single piece of data that our customers upload will pass through your systems first.

This role is ideal for an engineer who loves puzzle-hunting: reverse-engineering sparsely documented formats, wrangling coordinate systems and transforms, and hunting down strange camera projection issues.

You’ll sit at the divide between our customers and our product, at the front of everything we do: our models are only as good as the data feeding them, and you’ll be the one making messy real-world sensor data trustworthy at scale.

Responsibilities
  • Develop and maintain scalable, reproducible workflows for ingesting and processing large volumes of point cloud, imagery, and geospatial data.

  • Convert datasets from various sensor providers into Mach9's standardized internal formats.

  • Build CI/CD pipelines and automated checks that guarantee the correctness and consistency of data pipelines, including regression detection on dataset processing.

  • Optimize processing performance, query speed, and storage efficiency across large geospatial datasets.

  • Work closely with the customer success team to efficiently resolve issues and unblock customer projects.

    • Build and maintain an agentic harness for automated dataset triage and code patching that automatically proposes or applies fixes and escalates when human judgment is needed.

  • Work closely with ML and product teams to make data readily usable for training, inference and visualization.

  • Work closely with customers and data-provider partners to facilitate data integration (with occasional travel).

  • Puzzle-hunting: work with data formats with sparse or missing documentation.

Requirements
  • Strong software development, problem-solving, and debugging skills, with hands-on experience building production systems in Python.

  • Solid foundation in distributed systems and parallel computing.

  • Comfort operating with ambiguity — able to dig into undocumented or messy data formats, reverse-engineer how they work, and make steady progress without a clear spec.

  • Experience building agentic systems and setting up agent harnesses — orchestrating LLM-driven workflows for triage, debugging, or automated code patching.

  • Strong communication and collaboration skills, with the ability to work across ML, product, and customer-facing teams.

  • Bachelor's degree in Computer Science, Engineering, or equivalent experience.

Bonus qualifications
  • Understanding of geospatial data formats (e.g., LAS/LAZ, COPC, E57, GeoTIFF, Shapefiles) and tooling (e.g., GDAL, PDAL, untwine, laz-perf).

  • Expertise designing and managing data schemas and storage systems for geospatial data (e.g., Postgres/PostGIS, AWS S3).

  • Experience with large-scale data processing frameworks and cloud platforms (e.g., Spark, AWS Batch).

  • Familiarity with coordinate reference systems and transforms (CRS, WKT, pyproj, affine transforms).

  • Experience building data versioning, lineage, or artifact-tracking systems.

  • Experience operating data pipelines that feed ML training and inference.

  • Familiarity with C++.


The Company
HQ: San Francisco, CA
24 Employees
Year Founded: 2021

What We Do

Mach9 is at the forefront of leveraging advanced machine learning and computer vision techniques to transform raw geospatial data into actionable insights that help civil engineers build and maintain infrastructure globally. Our first product, Mach9 Digital Surveyor, helps surveyors automatically extract features from large-scale imagery and 3D datasets over 30x faster than today's manual and labor-intensive drafting workflows, accelerating the development of cost-effective and sustainable transportation and utility infrastructure. Mach9 helps leading asset owners and engineering and construction organizations globally solve the toughest engineering design, mission planning, and asset management problems.


