Senior Data Architect (AI & AI-Assisted Development)

Reposted 5 Hours Ago
Hiring Remotely in Costa Rica
Remote
Senior level
Big Data • Software • Analytics
The Role
Lead design and implementation of scalable Cloudera-based data warehouse and lakehouse platforms, integrate AI/LLM and RAG workflows, build batch and near-real-time pipelines, define governance/security/quality standards, optimize SQL engines, mentor teams, and translate FP&A and Sales requirements into analytics and data products.
Summary Generated by Built In

Business Area:

IT

Seniority Level:

Mid-Senior level

Job Description: 

At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.

We are seeking an experienced Senior Data Architect (AI-First Data Architecture & AI-Assisted Development) with 5+ years of experience designing scalable enterprise data platforms and enabling modern AI-driven ecosystems. The ideal candidate will bring deep expertise in data warehousing and lakehouse architectures, combined with hands-on experience in AI governance, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), semantic data architectures, and AI-assisted development practices.

This role extends beyond traditional data architecture by partnering with Business Intelligence, Data Science, Engineering, and AI teams to build AI-ready data foundations. The architect will lead the design of data models, metadata frameworks, and governance practices that optimize enterprise data for AI consumption, intelligent search, agentic workflows, and RAG-based applications. A key focus will be establishing robust metadata, business definitions, lineage, data tagging, and semantic structures to improve the accuracy, discoverability, and scalability of AI-powered solutions.

The successful candidate will drive AI-first data acquisition, curation, and governance strategies that support business intelligence, advanced analytics, and AI-driven decision-making across Finance, Sales, and other strategic business domains. They will also champion AI-assisted architecture and documentation practices to accelerate delivery, improve productivity, and create reusable patterns that enable both users and AI systems to effectively discover, understand, and leverage enterprise data.

This role will lead the evolution of intelligent, governed, and scalable data platforms that seamlessly integrate traditional data engineering with next-generation AI-powered capabilities, ensuring the organization's data ecosystem is optimized for the future of AI-enabled business operations.

As a Sr. Data Architect you will:

  • Design and implement scalable data warehouse and lakehouse architectures on the Cloudera platform.

  • Define enterprise data models, governance frameworks, data stewardship processes, security standards, and data quality practices.

  • Architect and optimize analytics solutions across SQL engines and table formats, including Impala, Hive, and Iceberg.

  • Design AI-powered analytics solutions leveraging LLMs, Retrieval-Augmented Generation (RAG), vector databases (such as PostgreSQL, Qdrant, and Milvus), and natural language query (NLQ) capabilities.

  • Lead the integration of AI/ML capabilities into enterprise data platforms and data pipelines while establishing governance controls for AI models, data usage, and lifecycle management.

  • Leverage vibe coding / AI-assisted development tools to accelerate development and improve productivity.

  • Build and optimize batch and near real-time data pipelines.

  • Collaborate with business stakeholders to translate business requirements into scalable data products and analytics solutions.

  • Establish best practices for performance optimization, data architecture, and AI-assisted development.

  • Mentor teams on modern data architecture and AI-enabled development methodologies.

  • Ensure data security, governance, compliance, and responsible AI practices within enterprise data platforms and AI-enabled solutions.

  • Collaborate with business stakeholders across FP&A, Sales, and Revenue Operations to translate business requirements into scalable data solutions that support financial forecasting, revenue optimization, budgeting, pipeline analysis, and sales forecasting.


We are excited about you if you have:

  • Bachelor’s degree in Computer Science or equivalent and 5-6 years of related experience; OR Master’s degree and 3-5 years of related experience; OR PhD and 0-3 years of related experience.

  • Deep expertise in enterprise data warehousing, lakehouse architectures, and Cloudera-based data platforms.

  • Strong experience with CDP, including HDFS, Hive, Impala, Kudu, and Cloudera data ingestion and processing frameworks.

  • Strong understanding of distributed data systems and Hadoop-based architectures.

  • Advanced SQL skills, including performance tuning and query optimization.

  • Proficiency in Python and data engineering frameworks.

  • Experience with dimensional and normalized data modeling.

  • Strong understanding of data governance, lineage, metadata management, data cataloging, enterprise security, and compliance requirements.

  • Experience implementing AI governance practices including model governance, AI risk management, explainability, monitoring, and responsible AI controls.

  • Experience implementing AI/ML, LLM, vector database, and RAG-based solutions in production environments.

  • Familiarity with AI-assisted development tools (e.g., GitHub Copilot and LLM-powered workflows).

  • Strong communication, stakeholder management, and problem-solving skills.

  • Ability to align enterprise data architecture with business objectives in Finance, Sales, and Revenue Operations.

  • Ability to bridge traditional data platforms with modern AI capabilities.

You might also have:

  • Experience with CDP Public Cloud and Private Cloud deployments.

  • Knowledge of Cloudera Data Warehouse (CDW), Cloudera Data Engineering (CDE), Kafka, Spark, and streaming architectures.

  • Experience with generative AI, vector databases, modern AI data architectures, and AI governance frameworks.

  • Understanding of Data Mesh, Data Fabric, and enterprise governance operating models.

  • Experience working with Salesforce, NetSuite, and other enterprise business systems.

  • Experience supporting FP&A, Sales Analytics, and executive reporting environments.

What Makes This Role Unique

  • Lead architecture for a modern Cloudera-centric enterprise data platform.

  • Drive the convergence of data warehousing, lakehouse architectures, and AI innovation.

  • Influence enterprise-wide data, governance, and AI strategy.

  • Champion AI-assisted development practices to accelerate productivity.

This role is not eligible for immigration sponsorship.

What you can expect from us:

  • Generous PTO Policy 

  • Support for work-life balance with Unplugged Days

  • Flexible WFH Policy 

  • Mental & Physical Wellness programs 

  • Phone and Internet Reimbursement program 

  • Access to Continued Career Development 

  • Comprehensive Benefits and Competitive Packages 

  • Paid Volunteer Time

  • Employee Resource Groups

EEO/VEVRAA

Skills Required

  • Bachelor's degree in Computer Science or equivalent (or Master's/PhD with adjusted experience)
  • 5-6 years related experience (or 3-5 years with Master's, or 0-3 years with PhD)
  • Deep expertise in enterprise data warehousing and lakehouse architectures
  • Experience with Cloudera-based platforms and CDP (HDFS, Hive, Impala, Kudu)
  • Advanced SQL skills including performance tuning and query optimization
  • Proficiency in Python and data engineering frameworks
  • Strong understanding of distributed data systems and Hadoop-based architectures
  • Experience with dimensional and normalized data modeling
  • Knowledge of data governance, lineage, metadata management, and enterprise security
  • Experience implementing AI/ML, LLM, vector database, and RAG-based solutions in production
  • Familiarity with AI-assisted development tools and LLM-powered workflows (e.g., GitHub Copilot)
  • Strong communication, stakeholder management, and problem-solving skills
  • Ability to align enterprise data architecture with Finance, Sales, and Revenue Operations objectives
  • Experience with CDP Public Cloud and Private Cloud deployments
  • Knowledge of CDW, CDE, Kafka, Spark, and streaming architectures
  • Familiarity with Data Mesh and Data Fabric concepts
  • Experience working with Salesforce, NetSuite, or other enterprise business systems
  • Experience supporting FP&A, Sales Analytics, and executive reporting environments

Cloudera Compensation & Benefits Highlights

The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Cloudera and has not been reviewed or approved by Cloudera.

  • Leave & Time Off Breadth: Time off includes generous PTO and holidays plus recurring company‑wide Unplugged Days that provide regular recharge time. Volunteer time off and flexible scheduling options further expand usable leave.
  • Healthcare Strength: Health coverage spans comprehensive medical, dental, and vision alongside EAP, wellness sessions, and U.S. gym reimbursement. These elements position healthcare as a strong anchor within the package.
  • Strong & Reliable Incentives: Compensation often includes variable incentives and long‑term incentive programs with annual bonuses commonly offered. Sales and other revenue roles show competitive on‑target earnings when goals are met, reinforcing the incentive structure.

The Company
HQ: Palo Alto, CA
3,092 Employees
Year Founded: 2008

What We Do

At Cloudera, we believe that data can make what is impossible today, possible tomorrow. We empower people to transform complex data into clear and actionable insights. Cloudera delivers an enterprise data cloud for any data, anywhere, from the Edge to AI. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.
