Granica

Software Engineer – Foundational Data Systems for AI

Reposted 25 Days Ago

Mountain View, CA

In-Office

140K-200K Annually

Senior level

Artificial Intelligence • Big Data • Cloud • Machine Learning • Software • Business Intelligence • Data Privacy

Better Data for Better AI

The Role

As a Lakehouse Core Engineer, you'll enhance data systems, optimize storage and compute, develop ACID transactions, and improve query performance, focusing on petabyte-scale efficiency.

Summary Generated by Built In

About Granica

Granica is an AI research and infrastructure company focused on reliable, steerable representations for enterprise data.

We earn trust through Crunch, a policy-driven health layer that keeps large tabular datasets efficient, reliable, and reversible. On this foundation, we’re building Large Tabular Models—systems that learn cross-column and relational structure to deliver trustworthy answers and automation with built-in provenance and governance.

The Mission

AI today is limited not only by model design but by the inefficiency of the data that feeds it. At scale, each redundant byte, each poorly organized dataset, and each inefficient data path slows progress and compounds into enormous cost, latency, and energy waste.

Granica’s mission is to remove that inefficiency. We combine new research in information theory, probabilistic modeling, and distributed systems to design self-optimizing data infrastructure: systems that continuously improve how information is represented and used by AI.

This engineering team partners closely with the Granica Research group led by Prof. Andrea Montanari (Stanford), bridging advances in information theory and learning efficiency with large-scale distributed systems. Together, we share a conviction that the next leap in AI will come from breakthroughs in efficient systems, not just larger models.

What You’ll Build

Global Metadata Substrate. Help design and implement the metadata substrate that supports time-travel, schema evolution, and atomic consistency across massive tabular datasets.
Adaptive Engines. Build components that reorganize data autonomously, learning from access patterns and workloads to maintain efficiency with minimal manual tuning.
Intelligent Data Layouts. Develop and refine bit-level encodings, compression, and layout strategies to extract maximum signal per byte read.
Autonomous Compute Pipelines. Contribute to distributed compute systems that scale predictively and adapt to dynamic load.
Research to Production. Translate new algorithms in compression and representation from research into production-grade implementations.
Latency as Intelligence. Design and optimize data paths to minimize time between question and insight, enabling faster learning for both models and humans.

What You Bring

Foundational understanding of distributed systems: partitioning, replication, and fault tolerance.
Experience or curiosity with columnar formats such as Parquet or ORC and low-level data encoding.
Familiarity with metadata-driven architectures or data query planning.
Exposure to or hands-on use of Spark, Flink, or similar distributed engines on cloud storage.
Proficiency in Java, Rust, Go, or C++ and commitment to clean, reliable code.
Curiosity about how compression, entropy, and representation shape system efficiency and learning.
A builder’s mindset—eager to learn, improve, and deliver features end-to-end with growing autonomy.

Bonus

Familiarity with Iceberg, Delta Lake, or Hudi.
Contributions to open-source projects or research in compression, indexing, or distributed systems.
Interest in how data representation influences AI training dynamics and reasoning efficiency.

Why Granica

Fundamental Research Meets Enterprise Impact. Work at the intersection of science and engineering, turning foundational research into deployed systems serving enterprise workloads at exabyte scale.
AI by Design. Build the infrastructure that defines how efficiently the world can create and apply intelligence.
Real Ownership. Design primitives that will underpin the next decade of AI infrastructure.
High-Trust Environment. Deep technical work, minimal bureaucracy, shared mission.
Enduring Horizon. Backed by NEA, Bain Capital, and various luminaries from tech and business. We are building a generational company for decades, not quarters or a product cycle.

Compensation & Benefits

Competitive salary, meaningful equity, and substantial bonus for top performers
Flexible time off plus comprehensive health coverage for you and your family
Support for research, publication, and deep technical exploration

At Granica, you will shape the fundamental infrastructure that makes intelligence itself efficient, structured, and enduring. Join us to build the foundational data systems that power the future of enterprise AI!

Top Skills

Apache Iceberg

Delta Lake

Java

Parquet

Scala

Spark

The Company

HQ: Mountain View, California

37 Employees

Year Founded: 2023

What We Do

Massive-scale data should be an asset — not a liability.

At Granica, we’re building a state-of-the-art AI efficiency, data optimization and compression platform designed to make cloud-scale data cheaper, faster, safer and more intelligent.

As enterprises generate and store unprecedented volumes of data, traditional infrastructure can’t keep up — costs explode, performance lags, and privacy becomes harder to enforce. Granica changes that. We sit beneath the lakehouse — optimizing the data itself through advanced lossless compression, intelligent data selection, and built-in privacy preservation. Our platform enables enterprises to cut cloud data costs by up to 80% while improving performance and reducing operational complexity.

This is deep infrastructure — already deployed in live production environments, operating across hundreds of petabytes of enterprise data.

Why Work With Us

We’re a tight-knit team combining:

* Fundamental research in compression, data systems, and information theory
* World-class systems engineering across storage, cloud infrastructure, and security
* A shared obsession with performance, scale, and clean design