Join our team as a Senior Data Engineer. In this role, your expertise in computer science fundamentals, including data structures, algorithms, and system design, will be the foundation of the solutions we build. The role calls for proficiency in developing and improving compute- and I/O-intensive applications and in ensuring their performance and reliability.
Responsibilities:
- Develop and implement real-time data ingestion and processing systems.
- Design, build, and operationalize large-scale enterprise data solutions and applications.
- Create and manage production data pipelines, from ingestion to consumption, within a big data architecture using PySpark.
- Leverage expertise in Python, Airflow, SQL, and cloud platforms to build robust solutions (strong knowledge of these is essential).
- Translate complex business challenges into scalable and efficient technical solutions.
- Work collaboratively with a high-performing data engineering team, owning the full lifecycle of solution implementation.
Requirements:
- Bachelor's Degree in Computer Science, Information Technology, or a similar discipline.
- 3-8 years of professional experience in data engineering or related fields.
- Proven experience with ETL processes, data integration, and handling large-scale datasets using PySpark.
- Strong understanding of data engineering concepts like ETL, near/real-time streaming, data structures, and workflow management.
- Experience with version control tools like Git/GitHub.
- Proficiency in SQL and Data Warehouse concepts.
- Experience working with SQL databases or NoSQL databases such as Cassandra, MongoDB, or HBase.
- Knowledge of AWS services such as EMR, Redshift, Kinesis, Lambda, Glue, S3, IAM, and CloudWatch, and of big data tools such as Hadoop, Hive, and Sqoop.
What We Do
Arcana enables institutional investors to understand their portfolio risks, decompose single-stock and book performance, drill into crowding, and isolate their idiosyncratic differentiation, all built on our proprietary crowding, ownership, factor risk, and performance datasets.
The company's investors include D1 Capital, Duquesne (Stan Druckenmiller), Tiger Global, Abstract Ventures, GoldenTree Asset Management, Ryan Roslansky (CEO of LinkedIn), and Akshay Kothari (COO of Notion), among others.