Trunk

Staff Data Engineer

Posted 5 Days Ago

San Francisco, CA

170K-210K Annually

Senior level

Software

The Role

As a Senior Data Engineer at Trunk, you will build scalable data pipelines, design efficient data storage solutions, and optimize distributed applications. You will work closely with customers to understand their needs and integrate AI models for analytics. The role requires a strong understanding of distributed systems and experience in building data applications.

Summary Generated by Built In

At Trunk, we're on a mission to empower growing software organizations to deliver high-quality software quickly. We understand the challenges of merge conflicts, poor code quality or consistency, flaky tests, and other distractions that can drain productivity and morale. Our unique approach enables engineering teams to stay focused on designing, implementing, and delivering software, leading to the creation of magical, high-quality projects and happier teams.

Our journey began in 2021, with our founders leveraging their experience from some of the world's largest and fastest-growing tech companies - Uber, Google, YouTube, and Microsoft. In 2022, we achieved a significant milestone by securing a $25M Series A funding round led by Garry Tan (currently President of YC) at Initialized Capital and Peter Levine at a16z. This growth and recognition are a testament to our potential and the value we bring to the software development landscape.

We know the frustration of trying to deliver code while constantly being interrupted by slow CI, flaky tests, and fragile processes. At Trunk, we’re building the tools to bring the joy back to software development. We’re looking for entrepreneurial people who are passionate about solving these problems.

As a founding member of our Data Engineering team, you’ll leverage your technical expertise to build data pipelines for processing and storing the data generated by our customers' CI/CD and automated tests. You’ll also experiment with integrating AI models to drive analytics and insights for our customers. We're tackling challenging problems and need engineers who can operate well in ambiguity and develop great solutions.

As an engineering team, we thrive on our ability to move quickly and adapt as we learn. Quickly delivering value to customers and getting their feedback is critical to our success. Engineers will be able to work closely with customers to understand the nuances of their use cases. We value empathy, hard work, and collaboration.

Our data stack is constantly evolving, but it is built on the foundations of Python, PostgreSQL, Spark, TimescaleDB, AWS, Kubernetes, and AWS Glue.

What you'll do 🧑‍💻

Build fault-tolerant and scalable data pipelines
Design efficient data storage, collaborating with product engineers to create fast and reliable data-driven features
Debug, profile, and optimize distributed data-intensive applications to improve their latency, accuracy, resource consumption, and throughput
Design and build observability of data quality and accuracy
Integrate ML models like Llama to analyze data and create features

We're looking for 🔎

10-12+ years of experience as a software engineer with a strong understanding of key concepts in distributed systems
10-12+ years of experience in building and deploying data applications, with a track record of regularly shipping new features
Fluency in at least two of these languages: Java/Scala/Kotlin, Python, Go, Rust, or C++
Good understanding and practical experience with partitioning, replication, map-reduce, indexing, and CAP theorem
Experience with distributed storage systems (S3, HDFS, Hive, ClickHouse, Elastic, etc.), distributed processing engines (Spark, etc.), and message queues (Kafka, SQS, etc.)
Passion for building large-scale ML applications and improving software engineers' productivity
Understanding of key concepts in natural language processing, machine learning, or statistical analysis

(Nice to have) Some experience with the machine learning stack (pandas, PyTorch, NumPy, scikit-learn, transformers, etc.)

What we offer 🎁

Unlimited PTO
Competitive salary and equity
Work-life balance
Flexibility to be fully or partly remote
Up to $200/month stipend for coworking space for remote folks
Few meetings, so you can ship fast and focus on building
One Medical membership on us!
Top-notch medical, dental, vision, short-term disability, long-term disability, and life insurance
All insurance is 100% company-paid ($0 premiums) for employees and highly subsidized for dependents
FSA, HSA with company contributions, and pre-tax commuter benefits
401(k) plan
Paid parental leave (up to 12 weeks)

Our tech stack 💻

Frontend: TypeScript, React, Redux, Next.js
Backend: TypeScript, Node, AWS, CDK, k8s, gRPC
Observability: Prometheus, Grafana, Kiali, Jaeger
CI/CD: GitHub Actions
CLI/Daemon/LSP: C++20, Bazel
VS Code Extension: TypeScript
General: GitHub, Slack, Linear, Slite

The salary and equity ranges for this role are $200K - $245K and 0.3% - 0.5%.

Please note that the compensation range provided is a general guideline only and is subject to change based on location, qualifications, and experience.

Top Skills

C++

Java

Kotlin

Python

Rust

Scala

The Company

HQ: San Francisco, California

50 Employees

On-site Workplace

Year Founded: 2021

What We Do

Trunk is a dev tools startup redefining software development at scale. We aim to flatten the lost productivity curve that software projects suffer as they grow in scale and complexity. When a majority of your engineers' time is not spent on actual engineering, when the tax paid to land a new Pull Request is greater than the time to write the code - it's time for a new approach.