Staff Software Engineer (Data Platform)
We are seeking a seasoned Staff Software Engineer for our Data Platform team. In this role, your deep expertise in distributed systems, data architectures, and large-scale processing will be the cornerstone of building high-performance data platforms. The role demands proficiency in designing and scaling compute- and I/O-intensive data systems, ensuring reliability, efficiency, and cost optimization across the data lifecycle.
Responsibilities:
● Design and build scalable data platform components for batch and real-time data processing.
● Architect, develop, and operationalize large-scale data systems across ingestion, transformation, and serving layers.
● Build and manage robust data pipelines ensuring high reliability, scalability, and cost efficiency.
● Develop reusable frameworks and tooling to accelerate productivity for data engineers and data scientists.
● Leverage expertise in Python, Airflow, SQL, and cloud platforms to build production-grade data solutions.
● Optimize query performance and data models, drawing on a strong understanding of columnar OLAP systems such as ClickHouse, Doris, and StarRocks.
● Implement streaming and near real-time data processing systems.
● Translate complex business requirements into scalable and efficient data platform solutions.
● Work collaboratively with cross-functional teams and provide technical leadership and mentorship.
● Drive architectural decisions by evaluating tradeoffs and selecting the right tools for the problem.
Requirements:
● Bachelor's Degree in Computer Science, Information Technology, or a similar discipline.
● 8+ years of professional experience in data engineering, backend systems, or distributed systems.
● Proven experience building scalable data platforms and large-scale data systems.
● Strong experience with ETL pipelines, data integration, and workflow orchestration systems such as Airflow or Temporal.
● Hands-on experience with Python and SQL, with a strong understanding of data warehouse concepts.
● Experience working with OLTP, OLAP, and search datastores such as PostgreSQL, ClickHouse, Cassandra, or Elasticsearch.
● Knowledge of messaging and streaming systems such as Kafka.
● Experience with cloud platforms (AWS/GCP) and big data tools such as Spark.
● Strong understanding of columnar storage systems and query optimization techniques.
● Solid understanding of distributed systems fundamentals and associated tradeoffs.
● Experience working with containers and orchestration tools such as Docker and Kubernetes.
● Strong Linux fundamentals and system-level debugging skills.
● Familiarity with modern data architectures such as Lakehouse (Iceberg, Hudi, Delta) is a plus.
What We Do
Arcana enables institutional investors to understand their portfolio risks, decompose single-stock and book performance, drill into crowding, and isolate their idiosyncratic differentiation — all built on our proprietary crowding, ownership, factor risk, and performance datasets. The company's investors include D1 Capital, Duquesne (Stan Druckenmiller), Tiger Global, Abstract Ventures, GoldenTree Asset Management, Ryan Roslansky (CEO, LinkedIn), and Akshay Kothari (COO, Notion), among others.