wgcDB: A Rust prototype of an adaptive multi-representation database inspired by a research paper
Rust · 2026-03-16 · surdeus

I've been experimenting with an idea from a database research paper (linked in the repo):
what if a database could observe your queries and automatically build the right index
for each column, instead of forcing you to create them upfront?
So I built a minimal prototype in Rust to test the concept.
Core idea
- Data is split into micro-shards (time-based directories with CSV data)
- Each shard maintains a pool of lightweight representations (indexes)
- The system monitors query patterns and automatically builds:
  - Mini B-trees for high-cardinality columns (e.g., user_id)
  - Bloom filters for low-cardinality columns (e.g., category)
- Queries pick the best available representation per shard at runtime
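To make the architecture above concrete, here is a minimal sketch of what a per-shard representation pool and runtime selection could look like. The type and field names (`Representation`, `Shard`, `lookup`) are illustrative assumptions, not taken from the wgcDB source:

```rust
use std::collections::{BTreeMap, HashMap};

// Hypothetical per-column representation pool for one micro-shard.
enum Representation {
    // Full scan is always available as a fallback.
    Scan,
    // Mini B-tree mapping column values to row offsets (high cardinality).
    BTree(BTreeMap<String, Vec<usize>>),
    // Bloom filter approximated here as a plain bit set (low cardinality).
    Bloom(Vec<bool>),
}

struct Shard {
    rows: Vec<Vec<String>>,                // CSV rows for this micro-shard
    reps: HashMap<String, Representation>, // per-column representation pool
}

impl Shard {
    // Pick the best available representation for a column at query time.
    fn lookup(&self, column: &str, key: &str, col_idx: usize) -> Vec<usize> {
        match self.reps.get(column) {
            Some(Representation::BTree(tree)) => {
                tree.get(key).cloned().unwrap_or_default()
            }
            Some(Representation::Bloom(bits)) => {
                // A Bloom filter can only rule rows out; on a possible hit,
                // fall back to scanning the shard.
                let h = key.bytes().fold(0usize, |a, b| {
                    a.wrapping_mul(31).wrapping_add(b as usize)
                });
                if bits[h % bits.len()] {
                    self.scan(key, col_idx)
                } else {
                    Vec::new()
                }
            }
            _ => self.scan(key, col_idx),
        }
    }

    // Fallback: linear scan over the shard's rows.
    fn scan(&self, key: &str, col_idx: usize) -> Vec<usize> {
        self.rows
            .iter()
            .enumerate()
            .filter(|(_, r)| r[col_idx] == key)
            .map(|(i, _)| i)
            .collect()
    }
}
```

The point of the enum is that a query never depends on an index existing: a shard with no built representation for a column simply falls through to the scan arm.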
Current MVP (800 lines of Rust)
git clone https://github.com/guangdawang/wgcDB
cd wgcDB
cargo run
You'll see:
- Test data generation (3 shards, 1000 rows each)
- Queries that automatically trigger index builds after the 3rd access
- Each shard choosing the optimal representation
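The "build after the 3rd access" behavior suggests a simple access counter per (shard, column) pair. A minimal sketch of that trigger might look like this; the names (`AccessTracker`, `BUILD_THRESHOLD`) are hypothetical, and the threshold of 3 just matches the demo output described above:

```rust
use std::collections::HashMap;

// Number of queries against a (shard, column) pair before an index build
// is triggered; 3 mirrors the demo's "after the 3rd access" behavior.
const BUILD_THRESHOLD: u32 = 3;

struct AccessTracker {
    // (shard id, column name) -> number of queries seen so far
    counts: HashMap<(usize, String), u32>,
}

impl AccessTracker {
    fn new() -> Self {
        Self { counts: HashMap::new() }
    }

    // Record one query against (shard, column); return true exactly when
    // the threshold is crossed, so the caller schedules a build once.
    fn record(&mut self, shard: usize, column: &str) -> bool {
        let c = self.counts.entry((shard, column.to_string())).or_insert(0);
        *c += 1;
        *c == BUILD_THRESHOLD
    }
}
```

Returning `true` only on the exact threshold crossing (rather than `>=`) keeps the build from being re-scheduled on every subsequent query.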
Background
This is a minimal implementation of ideas from the paper, intended as a
proof of concept to validate the architecture. The full vision includes RL-based
scheduling, more representation types (columnar, vector indexes), and persistent storage.
Why I'm posting
I'm looking for:
- Feedback on the architecture design
- Ideas for better scheduling logic (currently just counter-based)
- Potential contributors who find this direction interesting
The code is intentionally simple: great for learning both Rust and database internals!
Repo: https://github.com/guangdawang/wgcDB