We are seeking a Principal Software Developer – Data Architect to drive the technical vision and architectural
strategy of Caseware’s enterprise data platform, including the AI-Ready Data Platform. This role will define
the enterprise data architecture, patterns, and modeling standards that deliver trusted, governed, high-quality
data products. These data products form the foundational data platform for our cloud offerings, enable AI
capabilities and secure interoperability with customer systems, power analytics, and strengthen our core products.
This role requires deep experience designing modern data platforms and practical familiarity with how data
supports AI workflows, including retrieval, search, grounding, and secure interoperability patterns. You will
apply this experience to build a data foundation that supports AI workflows and agentic capabilities, analytics,
and customer interoperability.
This is a key leadership role where you will act as a hands-on architect while mentoring the development team,
guiding the long-term technical vision, shaping enterprise data architecture standards across teams, and
contributing to crucial AI and data platform projects.
❗ This is a full-time permanent position
❗ This is a new vacancy
📍 Location: This is a hybrid role requiring the successful candidate to work 3 days a week in our Toronto office, located at 351 King St E, Suite 1100, Toronto, ON.
What you will be doing:
strategy for a scalable, AI-Ready, enterprise data platform, including Sherlock modernization,
lakehouse architecture, data products, interoperability, and the patterns and capabilities needed to
support AI-Ready use cases.
standards, guardrails, and best practices for our foundational data platform, including Iceberg-based
lakehouse architecture, medallion patterns, ingestion, normalization, data quality, and
interoperability.
documentation, and prototyping, and mentor teams in responsible usage that improves design
quality, data discovery, and delivery effectiveness.
Sherlock products, schema modernization, and data model evolution.
adherence to high standards in data engineering practices, data modeling, data quality, and
platform architecture.
AI-Ready, and securely interoperable data products, including data contracts, ingestion and
normalization standards, and improving consistency and reuse across products.
classification, retention, tenant isolation, and access controls for datasets and data products.
make it easy for teams to produce data products and integrate with the data platform.
traceability, data dictionary controls, freshness monitoring, and alerting, so data products are
reliable and audit-ready.
What you will bring:
technical leadership role, preferably as a Principal Developer or Data Architect.
lakehouse, medallion, and analytics patterns, ingestion from OLTP systems, ETL/ELT pipelines,
distributed processing with Spark and Trino, and delivering analytics and AI-Ready data lakes at scale, with
strong operational practices.
analysis, design exploration, documentation, prototyping, code assistance, and mentoring teams on
responsible, effective usage.
DocumentDB, MS SQL Server, DynamoDB, AWS ElastiCache for Redis, and Valkey; event streaming and
queueing using SNS/SQS. Postgres, pgvector, and Kafka or Pub/Sub are an asset.
Formation, OpenSearch Serverless, S3 Vector Storage, Iceberg, Lambda, Step Functions, EKS, ETL on
EMR, and EMR Serverless.
guiding teams in data models, storage and integration architectures, data contracts, data domain
taxonomy, schema and event versioning, and resolving performance and scale bottlenecks.
sourcing, and CDC/change tracking strategies, safe historical reprocessing patterns, and performance
optimization through query analysis, indexing, and partitioning.
controls for privacy, access, auditability, safe reuse, and operational guardrails for AI-Ready datasets
and data products.
including governed data access, tenant-aware controls, and safe integration patterns.
retrieval, RAG data workflows, and real-time/event-driven data flows that support AI integrations.
Knowledge Bases, vector retrieval, and RAG workflows is preferred.
standards, and influencing technical direction across multiple teams.
tooling to deliver scalable, resilient data platforms and pipelines.
leadership on technical strategy, trade-offs, and decisions.
Key Success Factors:
• Establish a solid technical strategy: Collaborate with data platform, product, and architecture
leadership to define the AI-Ready Data Platform’s technical direction, ensuring alignment with
business growth, scalability, and interoperability objectives.
patterns and modeling standards backed by reference documentation and architecture decision
records that teams can apply consistently.
and cross-product data architecture improvements, strengthening the foundation for AI
capabilities, interoperability, scalability, and performance.
practices in data modeling, data quality, governance, and operational excellence.
Technologies you’ll work with:
OpenSearch Serverless, S3 Vector Storage, EMR/EMR Serverless, Spark, Trino, MapReduce,
Iceberg, Lambda, Step Functions, EKS, SNS/SQS; MongoDB, Amazon DocumentDB, MS SQL
Server, Redis/Valkey; Java (Spring), Python.
AWS Knowledge Bases, MCP, embeddings, vector retrieval, and RAG.
Skills Required
- 10+ years of experience in software development and data engineering
- At least 5 years in a senior technical leadership role (Principal Developer or Data Architect)
- Deep experience designing modern data platforms on AWS (lakehouse, medallion patterns, ingestion, ETL/ELT, analytics at scale)
- Hands-on experience with distributed processing (Spark) and query engines (Trino)
- Hands-on experience with core data technologies: MongoDB, Amazon DocumentDB, MS SQL Server, DynamoDB, Redis/Valkey
- Hands-on experience with AWS data services: S3, Athena, Glue Catalog, Lake Formation, OpenSearch Serverless, EMR/EMR Serverless, Lambda, Step Functions, EKS, S3 Vector Storage
- Proven ability to architect and deliver scalable, reliable data systems, including data models and storage and integration architectures
- Experience designing replication, CDC/change tracking, event sourcing, safe historical reprocessing, and performance optimization (partitioning, indexing)
- Experience defining data governance, privacy, access controls, tenant isolation, and auditability for data products
- Experience enabling secure interoperability patterns with customer systems and AI workflows (governed access, tenant-aware controls)
- Practical, hands-on use of AI tools to improve architecture and engineering workflows, and experience mentoring teams on responsible usage
- Proficiency in Java (Spring) and Python
- Experience working with DevOps, CI/CD, infrastructure-as-code, and operational tooling
- Familiarity with Postgres, pgvector, and Kafka or Pub/Sub
- Familiarity with AI-ready data patterns and platform integration concepts (embeddings, vector retrieval, RAG, AWS Bedrock, AWS Knowledge Bases, MCP)
What We Do
Caseware is the leading global provider of cloud-enabled audit, financial reporting and data analytics solutions for accounting firms, corporations and government regulators. Caseware’s innovative tools and platforms help more than half a million customers in 130 countries work smarter, dig deeper and see further as they transform insights into impact.