Sr. Data Engineer

Posted 5 Days Ago
Hiring Remotely in Costa Rica
Remote
Senior level
Big Data • Software • Analytics
The Role
Design and implement data architectures, develop data pipelines with AI orchestration, and empower users with self-service tools at Cloudera.
Summary Generated by Built In

Business Area:

IT

Seniority Level:

Mid-Senior level

Job Description: 

At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.

Cloudera Data Engineers power our Analytics and AI/ML initiatives by building scalable, high-performance data pipelines. In this role, you will lead our transition from traditional manual coding to AI-orchestrated development (Vibe Coding), architecting next-gen data pipelines and GenAI applications at unprecedented speed.

You will focus on modernizing development workflows and building GenAI-powered self-service tools that empower the business to resolve data needs independently. By designing robust, AI-first data management processes on Cloudera’s native platform, you will ensure data integrity while creating a blueprint for both internal efficiency and external customer success.

As a Senior Data Engineer you will:

  • Design and implement robust system architectures for real-time, near real-time, and batch processing data flows to meet the operational demands of complex business systems.

  • Master "Vibe Coding" and AI-orchestrated development to accelerate the delivery of new data pipelines and GenAI applications, reducing the end-to-end development lifecycle from days to hours.

  • Design and deploy GenAI-powered "Self-Service" tools, including automated documentation generators and natural language interfaces, to empower business users and reduce routine engineering requests.

  • Standardize AI-first engineering workflows across the team to ensure high-quality, auto-validated, and well-documented code delivery.

  • Develop and implement data transformations to enrich and provision data, following established specifications and standards while utilizing AI-first workflows.

  • Partner with data owners to ensure seamless, reliable data ingestion for both traditional analytics and GenAI-powered applications.

  • Collaborate with Data Architects, Operational Architects, and Data Analysts to understand the data and operational requirements across different business units.

  • Implement monitoring and CI/CD automation processes to track data quality and ensure the reliability of AI-supported data services.

We are excited if you have (Required Experience):

  • 5+ years of experience as a Data Engineer.

  • Solid skills in System Design for diverse data architectures, including expert-level knowledge of batch processing and real-time/streaming processing.

  • Proven experience with AI-first approaches and "Vibe Coding," with a demonstrated ability to deliver production-ready data pipelines using AI orchestration rather than purely manual coding.

  • Deep proficiency with AI-assisted coding tools, including Cursor, GitHub Copilot, or Gemini, to modernize and accelerate engineering workflows.

  • Proficient in coding with Python (primary) and SQL, with experience in ETL and data processing.

  • Hands-on experience with Distributed Systems and Big Data technologies, including Spark and the Hadoop ecosystem (Hive, Impala, Kafka).

  • Proven proficiency in Data Modeling using industry best practices (e.g., Kimball, Inmon) to ensure data integrity.

  • Ability to monitor critical data pipelines for quality and resolve any issues effectively.

  • Strong communication skills, both written and verbal.

  • Education: Bachelor’s degree in Computer Science, Engineering, Information Systems, or a related field (or equivalent experience).

You may also have:

  • Experience supporting GenAI applications from the Data Engineering side, including managing vector databases, RAG (Retrieval-Augmented Generation) pipelines, or LLM data orchestration.

  • Experience with Apache Airflow or Apache NiFi.

  • Expertise in optimizing data storage using HDFS/Parquet/Avro, Kudu, or HBase.

This role is not eligible for immigration sponsorship or relocation.

What you can expect from us:

  • Generous PTO Policy 

  • Support for work-life balance with Unplugged Days

  • Flexible WFH Policy 

  • Mental & Physical Wellness programs 

  • Phone and Internet Reimbursement program 

  • Access to Continued Career Development 

  • Comprehensive Benefits and Competitive Packages 

  • Paid Volunteer Time

  • Employee Resource Groups

EEO/VEVRAA

Cloudera Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Cloudera and has not been reviewed or approved by Cloudera.

  • Leave & Time Off Breadth: Time off includes generous PTO and holidays plus recurring company‑wide Unplugged Days that provide regular recharge time. Volunteer time off and flexible scheduling options further expand usable leave.
  • Healthcare Strength: Health coverage spans comprehensive medical, dental, and vision alongside EAP, wellness sessions, and U.S. gym reimbursement. These elements position healthcare as a strong anchor within the package.
  • Strong & Reliable Incentives: Compensation often includes variable incentives and long‑term incentive programs with annual bonuses commonly offered. Sales and other revenue roles show competitive on‑target earnings when goals are met, reinforcing the incentive structure.

The Company
HQ: Palo Alto, CA
3,092 Employees
Year Founded: 2008

What We Do

At Cloudera, we believe that data can make what is impossible today, possible tomorrow. We empower people to transform complex data into clear and actionable insights. Cloudera delivers an enterprise data cloud for any data, anywhere, from the Edge to AI. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.
