Key Responsibilities
- Architect and implement scalable, fault-tolerant data pipelines using AWS Glue, Lambda, EMR, Step Functions, and Redshift
- Build and optimize data lakes and data warehouses on Amazon S3, Redshift, and Athena
- Develop Python-based ETL/ELT frameworks and reusable data transformation modules
- Integrate multiple data sources (RDBMS, APIs, Kafka/Kinesis, SaaS systems) into unified data models
- Lead efforts in data modeling, schema design, and partitioning strategies for performance and cost optimization
- Drive data quality, observability, and lineage using the AWS Glue Data Catalog, Glue Data Quality, or third-party tools
- Define and enforce data governance, security, and compliance best practices (IAM policies, encryption, access control)
- Collaborate with cross-functional teams (Data Science, Analytics, Product, DevOps) to support analytical and ML workloads
- Implement CI/CD pipelines for data workflows using AWS CodePipeline, GitHub Actions, or Cloud Build
- Provide technical leadership, code reviews, and mentoring to junior engineers
- Monitor data infrastructure performance, troubleshoot issues, and lead capacity planning
Required Skills & Qualifications
- Bachelor’s or Master’s degree in Computer Science, Information Systems, or a related field
- 5–10 years of hands-on experience in data engineering or data platform development
- Expert-level proficiency in Python (pandas, PySpark, boto3, SQLAlchemy)
- Deep hands-on experience with AWS data services: Glue, Lambda, EMR, Step Functions, DynamoDB, Redshift, Athena, S3, Kinesis, and Amazon QuickSight
- Working knowledge of IAM, CloudWatch, and CloudFormation or Terraform for infrastructure automation
- Strong experience in SQL, data modeling, and performance tuning
- Proven ability to design and deploy data lakes, data warehouses, and streaming solutions
- Solid understanding of ETL best practices, partitioning, error handling, and data validation
- Hands-on experience in version control (Git) and CI/CD for data pipelines
- Knowledge of containerization (Docker/Kubernetes) and DevOps concepts
- Excellent analytical, debugging, and communication skills
Preferred Skills
- Experience with Apache Spark or PySpark on AWS EMR or Glue
- Familiarity with Airflow, dbt, or Dagster for workflow orchestration
- Exposure to real-time data streaming (Kafka, Kinesis Data Streams, or Firehose)
- Knowledge of Lake Formation, Glue Studio, or DataBrew
- Experience integrating with machine learning and analytics platforms (SageMaker, QuickSight)
- Certification: AWS Certified Data Analytics – Specialty or AWS Certified Solutions Architect
Soft Skills
- Strong ownership mindset with focus on reliability and automation
- Ability to mentor and guide data engineering teams
- Effective communication with both technical and non-technical stakeholders
What We Do
Egen is a data engineering and cloud modernization firm partnering with leading Chicagoland companies to launch, scale, and modernize industry-changing technologies. We are catalysts for change who create digital breakthroughs at warp speed. Our team of cloud and data engineering experts is trusted by top clients in pursuit of the extraordinary.
Our mission is to enable amazing possibilities for companies looking to harness the power of cloud and data. We want to stand shoulder to shoulder with clients as true technology partners and make sure they succeed at what they have set out to do. We want to be disruptors, game-changers, and innovators who play an important part in moving the world forward.