Torc Robotics

Senior Autonomy Data Engineer

Posted 2 Days Ago

Hiring Remotely in Blacksburg, VA, USA

Remote or Hybrid

161K-193K Annually

Senior level

Artificial Intelligence • Automotive • Robotics • Software • Transportation

The Role

Design, build, and operate data infrastructure and pipelines to ingest, validate, curate, and serve large-scale autonomous vehicle sensor logs for model training, labeling, visualization, and QA. Collaborate with ML and autonomy teams, define data contracts, and mentor junior engineers.

Summary Generated by Built In

About the Company:

At Torc, we have always believed that autonomous vehicle technology will transform how we travel, move freight, and do business. A leader in autonomous driving since 2007, Torc has spent over a decade commercializing our solutions with experienced partners. Now a part of the Daimler family, we are focused solely on developing software for automated trucks to transform how the world moves freight. Join us and catapult your career with the company that helped pioneer autonomous technology, and the first AV software company with the vision to partner directly with a truck manufacturer. 

Meet The Team:

Torc is hiring a Senior Autonomy Data Engineer to design, build, and operate the data infrastructure that powers our autonomy program. You will build the pipelines, storage systems, and tooling that turn raw vehicle sensor logs into the curated, structured datasets that our perception, planning, and simulation engineers depend on.

This is a high-ownership role on a lean team. Moving large-scale sensor data reliably from vehicles operating in demanding environments and making it quickly available for model training is a difficult, high-impact problem. You will work directly with ML engineers, autonomy developers, and platform engineers to close this data loop.

What You'll Do

Data Lake and Ingestion Pipeline
- Own the design and organization of the program’s data lake, including schema definitions, partitioning strategy and metadata indexing.
- Design and maintain reliable end-to-end pipelines that ingest high-bandwidth sensor logs from vehicles into cloud storage and tolerate ad hoc and intermittent connectivity.
- Develop data validation and integrity checks that detect corrupted data, missing sensors, and inconsistent calibration before downstream systems process the data.
- Implement retention, tiering and lifecycle policies for data to balance storage costs with development value.

Dataset Curation and Labeling Infrastructure
- Build tooling to query raw logs to produce curated training and evaluation datasets.
- Build automation to run cost-effective pseudo-labeling workflows at the scale of data ingest.
- Implement data quality and model performance metrics that are used to direct labeling effort toward the highest-value examples.

Autonomy Data Visualization
- Deploy and maintain data visualization tooling to support log review, annotation QA, and autonomy debugging workflows.
- Build integrations between the visualization tooling and the data lake so engineers can navigate from a dataset entry or model failure directly to the origin log data.
- Work with autonomy engineers to define and surface custom visualization panels and implement metrics for analyzing unstructured operating environments.
- Build dashboards that give autonomy engineers visibility into data coverage by terrain type, operating environment, and geographic region.

Cross-functional Collaboration
- Establish and document data contracts between the data services and model training consumers.
- Partner with perception, planning, and embedded engineers across the data lifecycle: from shaping the logging schemas and collection triggers to defining the dataset interfaces that supply model training and evaluation.
- Define data engineering standards, best practices, and tooling choices for an innovative and fast-paced team.
- Contribute to the data roadmap and provide input to technical leadership on investment priorities.
- Mentor junior engineers and raise the team’s capabilities in data infrastructure scalability and operational hygiene.

What You’ll Need to Succeed:

Bachelor’s degree in Computer Science, Computer Engineering, Software Engineering, Electrical Engineering, or a related field with 6+ years of data engineering experience, or a Master’s degree with 4+ years.
Strong proficiency in Python and SQL, with demonstrated ability to build production-quality data pipelines.
Deep experience with cloud data infrastructure (AWS preferred: S3, Glue, Athena, Redshift, or equivalent) and infrastructure-as-code tools (Terraform, CloudFormation).
Solid understanding of data partitioning strategies and columnar storage formats (Parquet, ORC, etc.).
Experience building and operating data pipelines that process time-series and binary data.
Proven ability to evaluate and integrate open-source tooling when appropriate versus building from scratch.
Strong instincts for delivering data quality through first-class implementations of monitoring, validation and lineage tracking.

Bonus points!

Experience with autonomous vehicles, robotics, or other sensor-driven autonomous systems.
Deep experience with Foxglove or Rerun beyond basic playback, e.g. building custom extensions or integrating them into a structured log review or annotation QA workflow.
Familiarity with the MCAP CLI and/or Python library and experience converting MCAP data to columnar data formats for further querying and processing.
Experience with data curation for ML training, e.g. diversity sampling, pseudo-labeling, and dataset versioning.

Perks of Being a Torc’r  

Torc cares about our team members and we strive to provide benefits and resources to support their health, work/life balance, and future. Our culture is collaborative, energetic, and team focused. Torc offers:  

A competitive compensation package that includes a bonus component and stock options
100% paid medical, dental, and vision premiums for full-time employees  
401K plan with a 6% employer match
Flexibility in schedule and generous paid vacation (available immediately after start date)
Company-wide holiday office closures
AD&D and Life Insurance

At Torc, we’re committed to building a diverse and inclusive workplace. We celebrate the uniqueness of our Torc’rs and do not discriminate based on race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, veteran status, or disabilities. Even if you don’t meet 100% of the qualifications listed for this opportunity, we encourage you to apply. 

Our compensation reflects the cost of labor across several geographic markets. Pay is based on a number of factors and may vary depending on job-related knowledge, skills, and experience. Torc's total compensation package will also include our corporate bonus and stock option plan. Dependent on the position offered, sign-on payments, relocation, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits.

Job ID: R-102765

Hiring Range for Job Opening

US Pay Range

$160,800—$193,000 USD

Skills Required

Bachelor's in Computer Science, Computer Engineering, Software Engineering, Electrical Engineering or related with 6+ years data engineering experience (or Master's with 4+ years).
Strong proficiency in Python and SQL and ability to build production-quality data pipelines.
Deep experience with cloud data infrastructure (AWS preferred: S3, Glue, Athena, Redshift) and infrastructure-as-code tools (Terraform, CloudFormation).
Solid understanding of data partitioning strategies and columnar storage formats (Parquet, ORC).
Experience building and operating data pipelines that process time-series and binary data.
Proven ability to evaluate and integrate open-source tooling versus building from scratch.
Experience delivering data quality through monitoring, validation, and lineage tracking implementations.
Experience with autonomous vehicles, robotics, or other sensor-driven autonomous systems.
Experience with Foxglove or Rerun beyond basic playback (custom extensions or integrations).
Familiarity with the MCAP CLI and/or Python library and converting MCAP to columnar formats.
Experience with data curation for ML training (diversity sampling, pseudo-labeling, dataset versioning).

The Company

HQ: Blacksburg, VA

500 Employees

Year Founded: 2005

What We Do

Torc Robotics is an independent subsidiary of Daimler Truck AG, a global leader and pioneer in trucking. Founded in 2005 at the birth of the self-driving vehicle revolution, we have 17 years of experience in pioneering safety-critical, self-driving applications. Torc offers a complete self-driving vehicle software and integration solution and is currently focusing on commercializing self-driving trucks.

Why Work With Us

Every Torc’r is unique. The traits that define and motivate us to save lives are what unite us. At Torc, we recognize that technical prowess is only part of the equation. Our team includes people with a consistent drive to accomplish great things. We look for those who don’t let ego get in the way of teamwork.