LanceDB is a developer-friendly, open-source data lake for multimodal AI. From hyper-scalable vector search to advanced retrieval for RAG, from streaming training data to interactive exploration of large-scale AI datasets, LanceDB is the best foundation for your AI application and powers some of today’s most groundbreaking and demanding applications.
We’re looking for a Software Engineer focused on Vector Indexing to help build the next generation of vector-native data infrastructure. You’ll work on high-performance indexing and search systems at the core of LanceDB, enabling scalable similarity search, full-text search, and flexible indexing for the open-source and enterprise communities alike.
Designing, building, and maintaining core vector indexing and search components
Implementing GPU-accelerated indexing algorithms and performance optimizations
Maintaining and evolving vector index algorithms, including pruning, quantization, and graph-based methods
Developing and optimizing full-text search capabilities and integrations
Benchmarking, profiling, and tuning performance across varied workloads
Writing and maintaining documentation, benchmarks, tutorials, and blog posts to support and grow adoption
Engaging with the open-source community: reviewing contributions, triaging issues, and joining design discussions
Strong proficiency in Rust
Experience designing or implementing vector search or indexing algorithms (e.g., HNSW, IVF, PQ, quantization, clustering)
Proficiency in C for GPU-related development
Familiarity with GPU acceleration frameworks (CUDA, ROCm, etc.)
Demonstrated ability to benchmark, profile, and optimize system performance
Excellent written communication and documentation skills
Comfortable collaborating in open-source environments
Understanding of full-text search systems (Lucene, Elasticsearch, Tantivy, etc.)
Experience building or maintaining data systems, databases, or search engines
Familiarity with distributed systems and scale-out architecture
Background in web APIs, embedding serving, or real-time systems
Contributions to or maintenance of open-source projects
A key role shaping an open-source project with real production usage
Remote-first team with flexible hours
Competitive compensation, equity, and benefits
Generous learning budget and support for open-source contributions
LanceDB was created by experts with decades of experience building tools for data science and machine learning. From co-authors of pandas to Apache PMC members of HDFS, Arrow, and Delta, the LanceDB team has created open-source tools used by millions worldwide.