wgcDB: A Rust prototype of an adaptive multi-representation database inspired by a research paper

⚓ Rust    📅 2026-03-16    👤 surdeus    👁️ 1

surdeus

I've been experimenting with an idea from a database research paper (linked in the repo):
what if a database could observe your queries and automatically build the right index
for each column, instead of forcing you to create indexes upfront?

So I built a minimal prototype in Rust to test the concept.

:brain: Core idea

  • Data is split into micro-shards (time-based directories with CSV data)
  • Each shard maintains a pool of lightweight representations (indexes)
  • The system monitors query patterns and automatically builds:
    • Mini B-trees for high-cardinality columns (e.g., user_id)
    • Bloom filters for low-cardinality columns (e.g., category)
  • Queries pick the best available representation per shard at runtime
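The cardinality-driven choice above can be sketched roughly like this. This is a minimal, hypothetical illustration (the type and function names are mine, not the repo's): a column with mostly distinct values gets a mini B-tree mapping values to row ids, while a low-cardinality column gets a small Bloom filter that can only answer "might this shard contain the value?".

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, HashSet};
use std::hash::{Hash, Hasher};

// Hypothetical sketch, not the repo's actual types.
#[allow(dead_code)]
enum Repr {
    BTree(BTreeMap<String, Vec<usize>>), // value -> row ids, supports point/range lookups
    Bloom { bits: Vec<bool>, k: u64 },   // membership-only filter, cheap to build
}

fn hash_with_seed(v: &str, seed: u64) -> u64 {
    let mut h = DefaultHasher::new();
    seed.hash(&mut h);
    v.hash(&mut h);
    h.finish()
}

fn build_repr(column: &[String]) -> Repr {
    let distinct: HashSet<&String> = column.iter().collect();
    // Heuristic: many distinct values -> point lookups benefit from a B-tree;
    // few distinct values -> a Bloom filter is enough to skip whole shards.
    if distinct.len() * 2 > column.len() {
        let mut tree: BTreeMap<String, Vec<usize>> = BTreeMap::new();
        for (row, v) in column.iter().enumerate() {
            tree.entry(v.clone()).or_default().push(row);
        }
        Repr::BTree(tree)
    } else {
        let m = 1024;
        let mut bits = vec![false; m];
        for v in column {
            for seed in 0..3 {
                bits[(hash_with_seed(v, seed) as usize) % m] = true;
            }
        }
        Repr::Bloom { bits, k: 3 }
    }
}

fn main() {
    let user_ids: Vec<String> = (0..1000).map(|i| format!("u{i}")).collect();
    let categories: Vec<String> = (0..1000).map(|i| format!("c{}", i % 5)).collect();
    assert!(matches!(build_repr(&user_ids), Repr::BTree(_)));
    assert!(matches!(build_repr(&categories), Repr::Bloom { .. }));
    println!("high-cardinality -> B-tree, low-cardinality -> Bloom filter");
}
```

The actual prototype presumably tracks query shapes too (point vs. range), but even this distinct-ratio heuristic captures the user_id vs. category split from the example above.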

:sparkles: Current MVP (800 lines of Rust)

git clone https://github.com/guangdawang/wgcDB
cd wgcDB
cargo run

You'll see:

  1. Test data generation (3 shards, 1000 rows each)
  2. Queries that automatically trigger index builds after the 3rd access
  3. Each shard choosing the optimal representation
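The "build after the 3rd access" behavior is what the post later calls counter-based scheduling. A minimal sketch of that trigger, assuming illustrative names (`Scheduler`, `on_access` are mine, not the repo's API):

```rust
use std::collections::HashMap;

// Hypothetical sketch of a counter-based scheduler; names are illustrative.
const BUILD_THRESHOLD: u32 = 3;

#[derive(Default)]
struct Scheduler {
    counts: HashMap<(String, String), u32>, // (shard, column) -> access count
    built: Vec<(String, String)>,           // indexes already scheduled
}

impl Scheduler {
    /// Record a query touching `column` in `shard`; returns true exactly
    /// when this access crosses the threshold and a build is triggered.
    fn on_access(&mut self, shard: &str, column: &str) -> bool {
        let key = (shard.to_string(), column.to_string());
        let n = self.counts.entry(key.clone()).or_insert(0);
        *n += 1;
        if *n == BUILD_THRESHOLD {
            self.built.push(key); // in the prototype: kick off the index build here
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut s = Scheduler::default();
    assert!(!s.on_access("shard-0", "user_id"));
    assert!(!s.on_access("shard-0", "user_id"));
    assert!(s.on_access("shard-0", "user_id")); // 3rd access triggers the build
    assert!(!s.on_access("shard-0", "user_id")); // already built, no re-trigger
    println!("built: {:?}", s.built);
}
```

Comparing the count with `==` rather than `>=` is what keeps the build from firing again on every later access.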

:books: Background

This is a minimal implementation of ideas from this paper –
just a proof-of-concept to validate the architecture. The full vision includes RL-based
scheduling, more representation types (columnar, vector indexes), and persistent storage.

:light_bulb: Why I'm posting

I'm looking for:

  • Feedback on the architecture design
  • Ideas for better scheduling logic (currently just counter-based)
  • Potential contributors who find this direction interesting

The code is intentionally simple – great for learning both Rust and database internals!

Repo: https://github.com/guangdawang/wgcDB
