Business Area: IT
Seniority Level: Mid-Senior level
Job Description:
At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.
We are seeking an experienced Senior Data Architect (AI-First Data Architecture & AI-Assisted Development) with 5+ years of experience designing scalable enterprise data platforms and enabling modern AI-driven ecosystems. The ideal candidate will bring deep expertise in data warehousing and lakehouse architectures, combined with hands-on experience in AI governance, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), semantic data architectures, and AI-assisted development practices.
This role extends beyond traditional data architecture by partnering with Business Intelligence, Data Science, Engineering, and AI teams to build AI-ready data foundations. The architect will lead the design of data models, metadata frameworks, and governance practices that optimize enterprise data for AI consumption, intelligent search, agentic workflows, and RAG-based applications. A key focus will be establishing robust metadata, business definitions, lineage, data tagging, and semantic structures to improve the accuracy, discoverability, and scalability of AI-powered solutions.
The successful candidate will drive AI-first data acquisition, curation, and governance strategies that support business intelligence, advanced analytics, and AI-driven decision-making across Finance, Sales, and other strategic business domains. They will also champion AI-assisted architecture and documentation practices to accelerate delivery, improve productivity, and create reusable patterns that enable both users and AI systems to effectively discover, understand, and leverage enterprise data.
This role will lead the evolution of intelligent, governed, and scalable data platforms that seamlessly integrate traditional data engineering with next-generation AI-powered capabilities, ensuring the organization's data ecosystem is optimized for the future of AI-enabled business operations.
As a Sr. Data Architect, you will:
Design and implement scalable data warehouse and lakehouse architectures on the Cloudera platform.
Define enterprise data models, governance frameworks, data stewardship processes, security standards, and data quality practices.
Architect and optimize analytics solutions across SQL engines including Impala, Hive, and Iceberg.
Design AI-powered analytics solutions leveraging LLMs, Retrieval-Augmented Generation (RAG), vector databases (such as PostgreSQL, Qdrant, and Milvus), and NLQ capabilities.
Lead the integration of AI/ML capabilities into enterprise data platforms and data pipelines while establishing governance controls for AI models, data usage, and lifecycle management.
Leverage vibe coding / AI-assisted development tools to accelerate development and improve productivity.
Build and optimize batch and near real-time data pipelines.
Collaborate with business stakeholders to translate business requirements into scalable data products and analytics solutions.
Establish best practices for performance optimization, data architecture, and AI-assisted development.
Mentor teams on modern data architecture and AI-enabled development methodologies.
Ensure data security, governance, compliance, and responsible AI practices within enterprise data platforms and AI-enabled solutions.
Collaborate with business stakeholders across FP&A, Sales, and Revenue Operations to translate business requirements into scalable data solutions that support financial forecasting, revenue optimization, budgeting, pipeline analysis, and sales forecasting.
We are excited about you if you have:
Bachelor’s degree in Computer Science or equivalent and 5-6 years of related experience; OR Master’s degree and 3-5 years of related experience; OR PhD and 0-3 years of related experience.
Deep expertise in enterprise data warehousing, lakehouse architectures, and Cloudera-based data platforms.
Strong experience with CDP, including HDFS, Hive, Impala, Kudu, and Cloudera data ingestion and processing frameworks.
Strong understanding of distributed data systems and Hadoop-based architectures.
Advanced SQL skills, including performance tuning and query optimization.
Proficiency in Python and data engineering frameworks.
Experience with dimensional and normalized data modeling.
Strong understanding of data governance, lineage, metadata management, data cataloging, enterprise security, and compliance requirements.
Experience implementing AI governance practices including model governance, AI risk management, explainability, monitoring, and responsible AI controls.
Experience implementing AI/ML, LLM, vector database, and RAG-based solutions in production environments.
Familiarity with AI-assisted development tools (e.g., GitHub Copilot and LLM-powered workflows).
Strong communication, stakeholder management, and problem-solving skills.
Ability to align enterprise data architecture with business objectives in Finance, Sales, and Revenue Operations.
Ability to bridge traditional data platforms with modern AI capabilities.
You might also have:
Experience with CDP Public Cloud and Private Cloud deployments.
Knowledge of Cloudera Data Warehouse (CDW), Cloudera Data Engineering (CDE), Kafka, Spark, and streaming architectures.
Experience with generative AI, vector databases, modern AI data architectures, and AI governance frameworks.
Understanding of Data Mesh, Data Fabric, and enterprise governance operating models.
Experience working with Salesforce, NetSuite, and other enterprise business systems.
Experience supporting FP&A, Sales Analytics, and executive reporting environments.
What Makes This Role Unique
Lead architecture for a modern Cloudera-centric enterprise data platform.
Drive the convergence of data warehousing, lakehouse architectures, and AI innovation.
Influence enterprise-wide data, governance, and AI strategy.
Champion AI-assisted development practices to accelerate productivity.
This role is not eligible for immigration sponsorship.
What you can expect from us:
Generous PTO Policy
Support for work-life balance with Unplugged Days
Flexible WFH Policy
Mental & Physical Wellness programs
Phone and Internet Reimbursement program
Access to Continued Career Development
Comprehensive Benefits and Competitive Packages
Paid Volunteer Time
Employee Resource Groups
EEO/VEVRAA
Skills Required
- Bachelor's degree in Computer Science or equivalent (or Master's/PhD with adjusted experience)
- 5-6 years related experience (or 3-5 years with Master's, or 0-3 years with PhD)
- Deep expertise in enterprise data warehousing and lakehouse architectures
- Experience with Cloudera-based platforms and CDP (HDFS, Hive, Impala, Kudu)
- Advanced SQL skills including performance tuning and query optimization
- Proficiency in Python and data engineering frameworks
- Strong understanding of distributed data systems and Hadoop-based architectures
- Experience with dimensional and normalized data modeling
- Knowledge of data governance, lineage, metadata management, and enterprise security
- Experience implementing AI/ML, LLM, vector database, and RAG-based solutions in production
- Familiarity with AI-assisted development tools and LLM-powered workflows (e.g., GitHub Copilot)
- Strong communication, stakeholder management, and problem-solving skills
- Ability to align enterprise data architecture with Finance, Sales, and Revenue Operations objectives
- Experience with CDP Public Cloud and Private Cloud deployments
- Knowledge of CDW, CDE, Kafka, Spark, and streaming architectures
- Familiarity with Data Mesh and Data Fabric concepts
- Experience working with Salesforce, NetSuite, or other enterprise business systems
- Experience supporting FP&A, Sales Analytics, and executive reporting environments
Cloudera Compensation & Benefits Highlights
The following summarizes recurring compensation and benefits themes identified from responses generated by popular LLMs to common candidate questions about Cloudera and has not been reviewed or approved by Cloudera.
- Leave & Time Off Breadth: Time off includes generous PTO and holidays plus recurring company-wide Unplugged Days that provide regular recharge time. Volunteer time off and flexible scheduling options further expand usable leave.
- Healthcare Strength: Health coverage spans comprehensive medical, dental, and vision alongside EAP, wellness sessions, and U.S. gym reimbursement. These elements position healthcare as a strong anchor within the package.
- Strong & Reliable Incentives: Compensation often includes variable incentives and long-term incentive programs, with annual bonuses commonly offered. Sales and other revenue roles show competitive on-target earnings when goals are met, reinforcing the incentive structure.
Cloudera Insights
What We Do
At Cloudera, we believe that data can make what is impossible today, possible tomorrow. We empower people to transform complex data into clear and actionable insights. Cloudera delivers an enterprise data cloud for any data, anywhere, from the Edge to AI. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.









