What You'll Do
- Design, build, and maintain reliable data pipelines for ingestion, transformation, and distribution of identity-linked data, processing large volumes efficiently in production.
- Develop ETL/ELT workflows using distributed computing frameworks on cloud infrastructure, applying strong engineering judgment to architectural and implementation decisions within your scope.
- Design and build API-first services that expose processed identity data to internal teams and external consumers, with a focus on reliability, clear contracts, and ease of integration.
- Implement data quality validation, monitoring, and observability for the components you own, ensuring reliability and correctness at scale.
- Contribute to platform-grade, reusable components that enable downstream teams and support self-service consumption of Identity capabilities.
- Take end-to-end ownership of key components within identity resolution systems and drive their reliability, scalability, and evolution.
- Design and implement privacy-compliant data handling practices, applying GDPR, CCPA, and Samba’s data governance policies, including support for clean room and privacy-preserving data collaboration workflows.
- Engage cross-functional stakeholders, including product, data science, and partner teams, to ensure the systems you build support all downstream use cases.
- Drive technical design for components within your scope, producing clear design documents, actively contributing to architecture discussions, and aligning the team on well-reasoned solutions.
- Conduct rigorous code reviews and uphold high standards for code quality, testability, and maintainability.
- Mentor engineers on the team through structured feedback, pairing, and design review.
- Collaborate across adjacent teams, understanding their constraints and requirements, and advocating for shared standards where applicable.
- Own the reliability of your components end-to-end, monitor their health, respond to incidents, and follow through on post-mortem improvements with rigor.
- Participate in on-call rotations and contribute actively to improving operational practices across the team.
- Drive improvements to CI/CD pipelines, deployment processes, and testing coverage for your team’s systems.
Who You Are
- 8+ years of professional software engineering experience, with a strong focus on data engineering, backend systems, or distributed data infrastructure.
- Proficient in Python and SQL; comfortable with JavaScript in full-stack or API contexts.
- Strong hands-on experience with distributed processing frameworks (e.g., Spark, Databricks, or equivalent) working with large-scale datasets in production.
- Practical experience with cloud platforms (AWS and/or GCP) and their core data services.
- Hands-on experience with workflow orchestration tools (Apache Airflow, dbt, Prefect, or equivalent).
- Strong familiarity with data warehousing and lakehouse technologies, including Snowflake.
- Solid understanding of data privacy regulations (GDPR, CCPA) and practical experience building compliant systems.
- Familiarity with platform thinking and API-first service design, building components that are reusable and consumable by downstream teams.
- A clear communicator and cross-functional collaborator, able to articulate technical decisions, engage constructively in design reviews, and navigate complex stakeholder relationships outside your immediate team.
- An active mentor: you invest in others, give direct feedback, and care about raising the bar for the team as a whole.
- Experience with streaming data processing frameworks (e.g., Kafka, Flink, Spark Streaming, or equivalent).
- Experience incorporating AI and machine learning capabilities into production data workflows.
- Exposure to ad tech, identity resolution, data licensing, or digital media, including familiarity with concepts such as device graphs, audience segmentation, or measurement.
What We Do
Television remains a vibrant cultural influence and an essential source of entertainment and information worldwide. Tremendous growth in content choices, and in viewing platforms that allow us to watch anything, anytime, on any screen, has actually made it harder for viewers to discover and keep up with all the great programming available. It has also become more competitive for content providers to keep your attention, and for marketers to make strong, measurable connections with their target consumers.
Technology that improves the viewing experience, enables content discovery, and addresses audience fragmentation across screens will strengthen television’s business model and relevance to consumers. Data is at the center of any solution to make TV better.
Samba TV's technology is built into Smart TVs and easily maps to smartphones and tablets. By recognizing what's on screen, Samba TV learns what viewers like and, using machine learning algorithms, enables discovery of shows and actors in a whole new way. Likewise, our data and measurement products are transforming the way stakeholders across the media landscape think about their business. Given the dramatic growth in streaming services, connected devices, time-shifting, and multi-screen viewership, our data products solve real problems and create a meaningful competitive advantage for our clients.






