Data Engineer

Posted 21 Days Ago
Hiring Remotely in Slovenia
Remote
Mid level
Machine Learning • Software
The Role
As a Data Engineer, you'll create and support ETL pipelines, ensure data quality, manage schemas, and optimize Spark jobs, collaborating closely with ML teams.
Who We Are

We are a diverse and focused team motivated to solve the hardest problems in the automotive industry through machine learning. We have developed an end-to-end machine learning platform to empower automakers to build safer, smarter, and more personalized vehicles. Our platform makes connected vehicle data more accessible and actionable for automakers, their partners, and their end customers.


Who You Are

You are a thoughtful engineer. You understand the complexities of distributed systems and how to triage and solve the issues that arise with them. You keep scalability top of mind when designing any system or writing code. You believe building a better ETL system requires close collaboration with the machine learning and data science teams.

About the Role

As an early data engineer at Viaduct, an analytics and ML platform company, you will do work that is critical to our success. You are responsible for ensuring data quality and reliability in every part of our ETL pipeline, from ingestion to client integrations.

Responsibilities

  • Creating and supporting batch, incremental, and real-time ETL pipelines
  • Standardizing ingestion, validation, and cleaning processes across clients
  • Automating data validation to increase data quality
  • Managing and evolving schemas in all parts of our pipeline
  • Monitoring, tuning, and optimizing Spark jobs
  • Becoming a domain expert in connected vehicle data

About You

  • 3+ years as a data engineer
  • Experience as a tech lead or mentor
  • Proficiency in Python/Scala/Java/C++ and SQL
  • 3+ years of experience with Spark or equivalent technologies
  • 2+ years of experience with a workflow scheduler (Airflow, Prefect, Argo, etc.)
  • 2+ years of experience with distributed file systems (HDFS, S3, etc.)
  • Familiarity with the tools in the open-source data ecosystem (Apache, CNCF, etc.)
  • Experience with incremental or real-time processing (Delta Lake, Apache Hudi, Kafka Streams, Spark Streaming, etc.)

Security and Privacy Responsibilities

  • Follow our policy and procedure documents related to security and privacy
  • Follow the security and privacy guidelines in the Employee Handbook
  • Participate in new hire and annual training for security and privacy
  • Treat data security and privacy as one of your primary job responsibilities
  • Report security incidents you discover as bugs
  • Get approval from the Security Team before adding new third-party software to our codebase
  • Explicitly consider security implications when doing PR reviews

Bonus

  • Experience with Kubernetes
  • Experience working with ML teams
  • Contributor to open source projects
  • Experience in the automotive industry or a love of cars
  • Prior work in small, agile teams


The Company
HQ: Menlo Park, CA
31 Employees

What We Do

www.viaduct.ai
