Rbatis + Turso: Building AI Agents & RAG Applications in Rust

⚓ Rust    📅 2026-05-10    👤 surdeus    👁️ 2

Introduction

With the explosion of Large Language Models (LLMs), AI Agent and RAG (Retrieval-Augmented Generation) architectures have become the mainstream paradigm for building intelligent applications. At the heart of RAG is vector search — transforming documents into vector embeddings, then retrieving the most relevant content via similarity search to help LLMs generate more accurate answers.

In the Rust ecosystem, Rbatis (a high-performance Rust ORM) and Turso (a ground-up Rust rewrite of SQLite with native vector search) offer a uniquely powerful combination: Rbatis delivers compile-time SQL generation for blazing performance, while Turso's native vector search eliminates the need for a separate vector database. This article explores how to apply Rbatis + Turso to AI Agent and RAG workloads.


Meet the Two Protagonists

Rbatis — High-Performance Rust ORM

Rbatis is a Rust ORM framework based on compile-time code generation. Key features:

  • Dynamic SQL compiled to native Rust code โ€” hand-written SQL performance
  • Zero runtime overhead โ€” all SQL parsing and optimization happens at compile time
  • Multi-database support: MySQL, PostgreSQL, SQLite, Turso, DuckDB, and more
  • Multiple SQL building styles: py_sql, html_sql (MyBatis-like), raw SQL
  • Complete CRUD macros โ€” auto-generate insert/update/delete/select with one line
[dependencies]
rbatis = { version = "4.9" }
rbdc-turso = { version = "4" }
rbs = { version = "4" }
serde = { version = "1", features = ["derive"] }
tokio = { version = "1", features = ["full"] }

Turso — SQLite Rewritten in Rust, with Native Vector Search

Turso is a ground-up rewrite of SQLite in Rust. It is 100% SQLite-compatible while adding concurrent writes, vector search, cloud-native access, and replication. Its killer feature is native vector search — no extensions or plugins required, works out of the box.

Turso supports multiple vector types:

| Vector Type | Function        | Precision per Dim             | Best For                                          |
|-------------|-----------------|-------------------------------|---------------------------------------------------|
| Dense       | vector32        | 32-bit float                  | Most ML embeddings (OpenAI, Sentence Transformers)|
| Dense       | vector64        | 64-bit float                  | Applications needing higher precision             |
| Sparse      | vector32_sparse | Non-zero values + indices only| TF-IDF, bag-of-words                              |
| Quantized   | vector8         | 1 byte                        | Large-scale search, ~4x compression               |
| Binary      | vector1bit      | 1 bit                         | Approximate NN search, ~32x compression           |

Supported similarity distance functions:

| Function                | Description             | Best For                              |
|-------------------------|-------------------------|---------------------------------------|
| vector_distance_cos     | Cosine distance         | Text embeddings, document similarity  |
| vector_distance_l2      | Euclidean (L2) distance | Image embeddings, spatial data        |
| vector_distance_dot     | Negative dot product    | Normalized embeddings, MIPS           |
| vector_distance_jaccard | Jaccard distance        | Sparse vectors, TF-IDF                |

Rbatis + Turso Basic Integration

Connection Configuration

The rbdc-turso driver supports three connection modes:

use rbatis::RBatis;
use rbdc_turso::TursoDriver;

#[tokio::main]
async fn main() -> Result<(), rbatis::Error> {
    let rb = RBatis::new();

    // Mode 1: In-memory database
    // rb.init(TursoDriver {}, "turso://:memory:")?;

    // Mode 2: Local file
    // rb.init(TursoDriver {}, "turso://target/rag.db")?;

    // Mode 3: Remote Turso database (production)
    let turso_url = std::env::var("TURSO_URL").unwrap_or_default();
    let turso_token = std::env::var("TURSO_TOKEN").unwrap_or_default();
    rb.init(
        TursoDriver {},
        &format!("turso://?url={}&token={}", turso_url, turso_token)
    )?;

    Ok(())
}

Entity & Table Definition

For RAG scenarios, we need to store document content and corresponding vector embeddings:

use serde::{Deserialize, Serialize};
use rbatis::crud;

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct Document {
    pub id: Option<i64>,
    pub title: Option<String>,
    pub content: Option<String>,
    pub embedding: Option<Vec<u8>>,  // BLOB column in Turso for vectors
}

// Auto-generate CRUD methods
crud!(Document{});

/// SQL to create the documents table
const CREATE_DOCUMENTS_TABLE: &str = "
CREATE TABLE IF NOT EXISTS documents (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    title TEXT NOT NULL,
    content TEXT,
    embedding BLOB
);
";

AI Agent Scenario: From Knowledge Base to Intelligent Decisions

Scenario

Let's build a developer AI assistant Agent that needs to:

  1. Retrieve relevant information from a technical documentation library
  2. Reason based on retrieved results and answer developer questions
  3. Remember conversation history context

Step 1: Build the Vector Knowledge Base

use rbatis::RBatis;
use rbatis::rbdc::Error;

/// Initialize the document knowledge base table
pub async fn init_knowledge_base(rb: &RBatis) -> Result<(), Error> {
    rb.exec(CREATE_DOCUMENTS_TABLE, vec![]).await?;
    Ok(())
}

/// Insert a document with its vector embedding
pub async fn insert_document(
    rb: &RBatis,
    title: &str,
    content: &str,
    embedding: &[f32],  // Output from an embedding model
) -> Result<(), Error> {
    // Serialize f32 slice to BLOB format expected by Turso's vector32
    let embedding_blob = embedding
        .iter()
        .flat_map(|f| f.to_le_bytes())
        .collect::<Vec<u8>>();

    let doc = Document {
        id: None,
        title: Some(title.to_string()),
        content: Some(content.to_string()),
        embedding: Some(embedding_blob),
    };

    Document::insert(rb, &doc).await?;
    Ok(())
}

Note: Turso's vector32() function accepts a JSON array string. You can use vector32('[0.1, 0.2, ...]') directly in SQL, or serialize from Rust and write to the BLOB column.
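
The byte layout used above can be captured in a pair of plain helpers for round-tripping embeddings through the BLOB column (a sketch, assuming the packed little-endian f32 layout shown in insert_document):

```rust
/// Serialize an f32 embedding into the packed little-endian BLOB layout.
fn embedding_to_blob(embedding: &[f32]) -> Vec<u8> {
    // 4 little-endian bytes per dimension
    embedding.iter().flat_map(|f| f.to_le_bytes()).collect()
}

/// Decode a BLOB read back from the database into an f32 vector.
fn blob_to_embedding(blob: &[u8]) -> Vec<f32> {
    // chunks_exact(4) ignores any trailing bytes that do not form a full f32
    blob.chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}
```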

Step 2: Vector Similarity Search

This is the critical point — why must we write raw SQL here?

Rbatis's crud! macro generates methods like select_by_map that handle structured field queries (equality, LIKE, IN, greater-than, etc.). Turso's vector search calls database-specific functions:

vector_distance_cos(embedding, vector32('[0.1, 0.2, ...]'))

Such built-in database functions, computed columns, and sort expressions cannot be expressed through generic CRUD macros — developers must hand-write SQL. This is not a limitation of Rbatis; it's the natural boundary of any ORM when facing database-specific features — just like MySQL's MATCH ... AGAINST full-text search or PostgreSQL's earth_distance() geo-computation.

Fortunately, Rbatis provides raw_sql and html_sql so you can gracefully return to native SQL when needed:

Approach 1: raw_sql (simple queries)

/// Semantic search: find the most similar documents to a query vector
pub async fn semantic_search(
    rb: &RBatis,
    query_embedding: &[f32],
    limit: u32,
) -> Result<Vec<Document>, Error> {
    let query_vec_str = query_embedding
        .iter()
        .map(|f| f.to_string())
        .collect::<Vec<_>>()
        .join(",");

    let sql = format!(
        r#"
        SELECT
            id, title, content,
            vector_distance_cos(embedding, vector32('[{}]')) AS distance
        FROM documents
        ORDER BY distance
        LIMIT {}
        "#,
        query_vec_str, limit
    );

    let docs: Vec<Document> = rb
        .fetch_decode(&sql, vec![])
        .await?;

    Ok(docs)
}

Approach 2: html_sql (complex vector + condition queries, recommended for production)

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "https://raw.githubusercontent.com/rbatis/rbatis/master/rbatis-codegen/mybatis-3-mapper.dtd">
<mapper>
    <select id="search_by_vector">
        SELECT id, title, content,
               vector_distance_cos(embedding, vector32('[${query_vec}]')) AS distance
        FROM documents
        <where>
            <if test="category != ''">
                AND category = #{category}
            </if>
        </where>
        ORDER BY distance
        LIMIT #{limit}
    </select>
</mapper>
#[rbatis::html_sql("rag_search.html")]
impl Document {
    pub async fn search_by_vector(
        rb: &dyn rbatis::Executor,
        query_vec: &str,
        category: &str,
        limit: u32,
    ) -> rbatis::Result<Vec<(Document, f64)>> { impled!() }
}

Summary: The crud! macro handles 80% of routine operations; the remaining 20% (database-specific features like vector search) use raw SQL or html_sql. They complement each other.

Note: vector_distance_cos returns a value between 0 and 2 — lower means more similar. You can also use vector_distance_l2 (Euclidean) or vector_distance_dot (negative dot product).
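
To make the 0-to-2 range concrete, here is a client-side sketch of the same metric (1 minus cosine similarity); in practice the computation happens inside Turso via vector_distance_cos:

```rust
/// Cosine distance = 1 - cosine similarity; ranges from 0 (identical
/// direction) through 1 (orthogonal) to 2 (opposite direction).
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - dot / (norm(a) * norm(b))
}
```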

Step 3: The Complete AI Agent Workflow

/// AI Agent Q&A workflow
pub async fn agent_answer(
    rb: &RBatis,
    user_question: &str,
    embedding_model: &dyn EmbeddingService,
    llm_service: &dyn LLMService,
) -> Result<String, Error> {
    // 1. Convert user question to a vector
    let query_embedding = embedding_model
        .embed(user_question)
        .await;

    // 2. Retrieve the most relevant documents from the knowledge base
    let relevant_docs = semantic_search(rb, &query_embedding, 5).await?;

    // 3. Build the prompt context
    let context: String = relevant_docs
        .iter()
        .filter_map(|d| d.content.as_deref())
        .collect::<Vec<_>>()
        .join("\n---\n");

    // 4. Call the LLM to generate an answer
    let prompt = format!(
        "Answer the question based on the following knowledge base:\n\nKnowledge Base:\n{}\n\nQuestion: {}\n\nAnswer:",
        context, user_question
    );

    let answer = llm_service
        .generate(&prompt)
        .await;

    Ok(answer)
}

RAG in Action: Complete Example

Project Structure

rag-demo/
โ”œโ”€โ”€ Cargo.toml
โ””โ”€โ”€ src/
    โ””โ”€โ”€ main.rs

Cargo.toml

[package]
name = "rag-demo"
version = "0.1.0"
edition = "2021"

[dependencies]
rbatis = { version = "4.9" }
rbdc-turso = { version = "4" }
rbs = { version = "4" }
serde = { version = "1", features = ["derive"] }
tokio = { version = "1", features = ["full"] }
reqwest = { version = "0.11", features = ["json"] }
serde_json = "1"
fast_log = "1.6"

Full Code

use rbatis::crud;
use rbatis::rbdc::Error;
use rbatis::RBatis;
use rbs::value;
use serde::{Deserialize, Serialize};

// ========== Entity Definitions ==========

#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct Document {
    pub id: Option<i64>,
    pub title: Option<String>,
    pub content: Option<String>,
    pub embedding: Option<Vec<u8>>,
}

crud!(Document{});

// ========== Embedding Service ==========

/// A simple embedding service interface (OpenAI API as example)
pub struct OpenAIEmbedding {
    api_key: String,
    model: String,
}

impl OpenAIEmbedding {
    pub fn new(api_key: &str) -> Self {
        Self {
            api_key: api_key.to_string(),
            model: "text-embedding-3-small".to_string(),
        }
    }

    pub async fn embed(&self, text: &str) -> Vec<f32> {
        // Simplified — in production, use reqwest to call the OpenAI embeddings API
        println!("Embedding: {}", text.chars().take(50).collect::<String>());
        // Return a mock 1536-dim vector (text-embedding-3-small dimension)
        vec![0.1_f32; 1536]
    }

    pub async fn embed_batch(&self, texts: &[&str]) -> Vec<Vec<f32>> {
        let mut results = Vec::new();
        for text in texts {
            results.push(self.embed(text).await);
        }
        results
    }
}

// ========== Database Operations ==========

const CREATE_DOCUMENTS_TABLE: &str = "
CREATE TABLE IF NOT EXISTS documents (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    title TEXT NOT NULL,
    content TEXT,
    embedding BLOB
);
";

pub async fn init_db(rb: &RBatis) -> Result<(), Error> {
    rb.exec(CREATE_DOCUMENTS_TABLE, vec![]).await?;
    println!("[OK] Document table initialized");
    Ok(())
}

pub async fn add_document(
    rb: &RBatis,
    title: &str,
    content: &str,
    embedding: &[f32],
) -> Result<(), Error> {
    let blob = embedding
        .iter()
        .flat_map(|f| f.to_le_bytes())
        .collect::<Vec<u8>>();

    let doc = Document {
        id: None,
        title: Some(title.to_string()),
        content: Some(content.to_string()),
        embedding: Some(blob),
    };

    Document::insert(rb, &doc).await?;
    println!("[OK] Document stored: {}", title);
    Ok(())
}

pub async fn batch_add_documents(
    rb: &RBatis,
    docs: &[(&str, &str)],
    embed_service: &OpenAIEmbedding,
) -> Result<(), Error> {
    let contents: Vec<&str> = docs.iter().map(|(_, c)| *c).collect();

    let embeddings = embed_service.embed_batch(&contents).await;

    for ((title, content), embedding) in docs.iter().zip(embeddings.iter()) {
        add_document(rb, title, content, embedding).await?;
    }
    Ok(())
}

/// Vector similarity search
pub async fn search_similar(
    rb: &RBatis,
    query_embedding: &[f32],
    limit: u32,
) -> Result<Vec<(Document, f64)>, Error> {
    let vec_str = query_embedding
        .iter()
        .map(|f| f.to_string())
        .collect::<Vec<_>>()
        .join(",");

    let sql = format!(
        r#"
        SELECT
            id, title, content, embedding,
            vector_distance_cos(embedding, vector32('[{}]')) AS distance
        FROM documents
        ORDER BY distance
        LIMIT {}
        "#,
        vec_str, limit
    );

    // Demo: fetch raw rows and print them; a production version would decode
    // the rows into (Document, f64) pairs instead of returning an empty Vec.
    let result = rb.fetch_decode::<serde_json::Value>(&sql, vec![]).await?;
    println!("Search results: {}", serde_json::to_string_pretty(&result).unwrap());

    Ok(vec![])
}

// ========== RAG Query Pipeline ==========

pub async fn rag_query(
    rb: &RBatis,
    question: &str,
    embed_service: &OpenAIEmbedding,
) -> Result<(), Error> {
    println!("\n=== RAG Query ===");
    println!("Question: {}", question);

    // 1. Embed the question
    let query_vec = embed_service.embed(question).await;
    println!("[1/3] Question vectorized");

    // 2. Retrieve similar documents
    let _results = search_similar(rb, &query_vec, 3).await?;
    println!("[2/3] Vector search complete");

    // 3. In production, pass results as context to an LLM
    println!("[3/3] Waiting for LLM to generate answer...");
    println!("=== End ===\n");

    Ok(())
}

// ========== AI Agent: Multi-turn with Memory ==========

/// Conversation message entity — the crud! macro auto-generates insert/insert_batch
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct ConversationMessage {
    pub id: Option<i64>,
    pub session_id: Option<String>,
    pub turn_index: Option<i64>,
    pub role: Option<String>,
    pub content: Option<String>,
    pub created_at: Option<String>,
}

crud!(ConversationMessage{});

pub struct ConversationMemory {
    pub messages: Vec<ConversationMessage>,
}

impl ConversationMemory {
    pub fn new() -> Self {
        Self { messages: vec![] }
    }

    pub fn add(&mut self, session_id: &str, turn_index: i64, role: &str, content: &str) {
        self.messages.push(ConversationMessage {
            id: None,
            session_id: Some(session_id.to_string()),
            turn_index: Some(turn_index),
            role: Some(role.to_string()),
            content: Some(content.to_string()),
            created_at: None,
        });
    }

    /// Get the last N turns of conversation context (for prompt construction)
    pub fn recent_context(&self, n: usize) -> String {
        self.messages
            .iter()
            .rev()
            .take(n * 2)
            .rev()
            .map(|m| format!("{}: {}", m.role.as_deref().unwrap_or(""), m.content.as_deref().unwrap_or("")))
            .collect::<Vec<_>>()
            .join("\n")
    }

    /// Persist conversation memory to Turso — uses the crud! macro
    pub async fn save_to_db(&self, rb: &RBatis) -> Result<(), Error> {
        rb.exec(
            "CREATE TABLE IF NOT EXISTS rbatis_conversation_message (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                session_id TEXT,
                turn_index INTEGER,
                role TEXT,
                content TEXT,
                created_at TEXT DEFAULT (datetime('now'))
            )",
            vec![],
        )
        .await?;

        // Batch insert via the crud! macro — no manual SQL needed
        ConversationMessage::insert_batch(rb, &self.messages, 50).await?;
        Ok(())
    }

    /// Load conversation history from the database — also via the crud! macro
    pub async fn load_from_db(rb: &RBatis, session_id: &str) -> Result<Self, Error> {
        let messages = ConversationMessage::select_by_map(
            rb,
            value! {"session_id": session_id},
        )
        .await?;
        Ok(Self { messages })
    }
}
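
The windowing logic in recent_context can be isolated as a small standalone function (a simplified sketch over (role, content) pairs, not part of the Rbatis API):

```rust
/// Keep only the last n turns (two messages per turn: user + assistant),
/// preserving chronological order — the same windowing recent_context uses.
fn recent_window(messages: &[(&str, &str)], n: usize) -> String {
    messages
        .iter()
        .rev()          // newest first
        .take(n * 2)    // n turns = n user + n assistant messages
        .rev()          // restore chronological order
        .map(|(role, content)| format!("{}: {}", role, content))
        .collect::<Vec<_>>()
        .join("\n")
}
```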

pub async fn agent_chat(
    rb: &RBatis,
    session_id: &str,
    turn: i64,
    question: &str,
    memory: &mut ConversationMemory,
    embed_service: &OpenAIEmbedding,
) -> Result<String, Error> {
    // 1. Retrieve knowledge base
    let query_vec = embed_service.embed(question).await;
    let _docs = search_similar(rb, &query_vec, 3).await?;

    // 2. Build context (retrieval results + conversation memory)
    let knowledge_context: String = /* extract from docs */ "Rbatis is a high-performance Rust ORM...".to_string();
    let memory_context = memory.recent_context(3);

    // 3. Construct the final prompt and send to LLM
    let _final_prompt = format!(
        "Knowledge Base Context:\n{}\n\nConversation History:\n{}\n\nUser Question: {}",
        knowledge_context, memory_context, question
    );

    // 4. Mock LLM response
    let answer = format!(
        "Based on the knowledge base, regarding '{}': this is an excellent question. Please refer to the Rbatis official documentation for details.",
        question
    );

    // 5. Update conversation memory using crud! macro
    memory.add(session_id, turn * 2, "user", question);
    memory.add(session_id, turn * 2 + 1, "assistant", &answer);

    Ok(answer)
}

// ========== Main ==========

#[tokio::main]
async fn main() -> Result<(), Error> {
    fast_log::init(fast_log::Config::new().console()).expect("log init failed");

    // Initialize Rbatis + Turso
    let rb = RBatis::new();
    rb.init(rbdc_turso::TursoDriver {}, "turso://target/rag_demo.db")?;
    println!("[OK] Connected to Turso database");

    // Initialize table structure
    init_db(&rb).await?;

    // Initialize embedding service
    let embed = OpenAIEmbedding::new("sk-your-api-key");

    // ===== Build the Knowledge Base =====
    let documents = vec![
        (
            "Rbatis ORM Introduction",
            "Rbatis is a high-performance Rust ORM framework based on compile-time code generation, supporting MySQL, PostgreSQL, SQLite, Turso, and more."
        ),
        (
            "Rbatis Dynamic SQL",
            "Rbatis supports py_sql and html_sql for dynamic SQL construction. html_sql resembles MyBatis XML template style."
        ),
        (
            "Turso Vector Search",
            "Turso natively supports vector search with types like vector32, vector64, vector8, and functions for cosine, Euclidean, dot product, and Jaccard distance."
        ),
        (
            "Rbatis + Turso Integration",
            "Through the rbdc-turso driver, Rbatis seamlessly connects to Turso databases, leveraging Turso's vector search to build RAG applications."
        ),
        (
            "AI Agent Architecture",
            "AI Agents use the ReAct pattern for reasoning and action. Combined with RAG, they can retrieve relevant information from knowledge bases to assist decision-making."
        ),
    ];

    batch_add_documents(&rb, &documents, &embed).await?;

    // ===== RAG Query Demo =====
    rag_query(&rb, "How to use Rbatis for vector search in Rust?", &embed).await?;

    // ===== AI Agent Multi-turn Conversation Demo =====
    let session_id = "session-001";
    let mut memory = ConversationMemory::new();

    let answer1 = agent_chat(
        &rb,
        session_id,
        1,
        "What databases does Rbatis support?",
        &mut memory,
        &embed,
    ).await?;
    println!("Agent: {}", answer1);

    let answer2 = agent_chat(
        &rb,
        session_id,
        2,
        "How do I use Turso's vector search?",
        &mut memory,
        &embed,
    ).await?;
    println!("Agent: {}", answer2);

    // Persist conversation memory — all handled by the crud! macro, no raw SQL
    memory.save_to_db(&rb).await?;
    println!("[OK] Conversation history saved to Turso (via crud! macro)");

    Ok(())
}

Architecture Overview

The complete architecture of Rbatis + Turso in an AI Agent / RAG system:

┌──────────────────────────────────────────────────────┐
│                   User Application                   │
│  ┌──────────┐  ┌───────────┐  ┌─────────────────┐    │
│  │ AI Agent │  │ RAG Query │  │ Multi-turn Chat │    │
│  └────┬─────┘  └─────┬─────┘  └────────┬────────┘    │
└───────┼──────────────┼─────────────────┼─────────────┘
        │              │                 │
        ▼              ▼                 ▼
┌──────────────────────────────────────────────────────┐
│                      Rbatis ORM                      │
│  Compile-time SQL · CRUD Macros · Dynamic SQL · Pool │
└──────────────────────────┬───────────────────────────┘
                           │
                           ▼
┌──────────────────────────────────────────────────────┐
│                  rbdc-turso Driver                   │
│  Turso Native Protocol · Rust SQLite · Type Mapping  │
└──────────────────────────┬───────────────────────────┘
                           │
                           ▼
┌──────────────────────────────────────────────────────┐
│                    Turso Database                    │
│  ┌───────────┐  ┌───────────────┐  ┌───────────────┐ │
│  │ Documents │  │ Vector Search │  │ Conversations │ │
│  │ · title   │  │ · vector32    │  │ · session_id  │ │
│  │ · content │  │ · cos/L2      │  │ · turn_index  │ │
│  │ · embed   │  │ · dot/Jacc    │  │ · role/content│ │
│  └───────────┘  └───────────────┘  └───────────────┘ │
└──────────────────────────────────────────────────────┘

Best Practices

1. Vector Dimension Management

  • OpenAI text-embedding-3-small outputs 1536-dim vectors
  • OpenAI text-embedding-3-large outputs 3072-dim vectors
  • Turso supports up to 65536 dimensions, more than sufficient
  • A 1536-dim vector takes approximately 6 KB of storage per record

2. Embedding Generation Strategy

  • On document ingestion: Chunk documents (256โ€“512 tokens, 20โ€“50 token overlap), generate embeddings per chunk
  • On query: Generate an embedding for the user's question in real time

3. Performance Optimization

  • Turso vector search currently uses linear scan (no index)
  • For large datasets, use WHERE to pre-filter (e.g., by category, tags)
  • Consider vector8 quantization for ~4x compression with acceptable precision loss
  • Rbatis's compile-time SQL ensures zero overhead on the query itself

4. Production Considerations

  • Use environment variables for Turso URL and Token
  • Add TTL or periodic cleanup for conversation memory
  • Consider a caching layer to reduce Embedding API calls
  • Use connection pooling (FastPool by default)
// Production connection config
rb.init(
    rbdc_turso::TursoDriver {},
    &format!(
        "turso://?url={}&token={}",
        std::env::var("TURSO_URL").expect("TURSO_URL not set"),
        std::env::var("TURSO_TOKEN").expect("TURSO_TOKEN not set")
    )
)?;

Summary

The Rbatis + Turso combination brings unique value to AI applications in the Rust ecosystem:

| Dimension       | Advantage                                                                                      |
|-----------------|------------------------------------------------------------------------------------------------|
| Performance     | Rbatis compile-time SQL + Turso native vector search — true zero-cost abstraction              |
| Simplicity      | No need for a separate vector database (Pinecone, Milvus, etc.); data and vectors stay together|
| Uniformity      | ORM and vector search share the same database — transactions and ACID guarantees come naturally|
| Maintainability | Pure Rust, type-safe, compile-time checks, fewer runtime errors                                |
| Cost            | Turso offers a generous free tier, great for prototyping and small-scale production            |

For Rust developers, this means you can quickly build AI Agents and RAG applications with semantic understanding and retrieval capabilities, using minimal dependencies and a familiar toolset. Rbatis handles structured data via ORM; Turso handles unstructured semantic search via vector search — together they provide a solid data foundation for AI applications.


Bonus: rbatis-py — Rbatis from Python

If your AI Agent stack uses Python (LangChain, LlamaIndex, etc.) but you want Rbatis's performance and Turso's vector search, try rbatis-py.

rbatis-py is a Python binding for Rbatis (built with PyO3 / Maturin), giving you the same ORM capabilities from Python.

Installation

pip install rbatis-py

Requires Python ≥ 3.8.

When to use rbatis-py?

| Scenario                                      | Recommendation                | Reason                                         |
|-----------------------------------------------|-------------------------------|------------------------------------------------|
| Pure Rust project                             | Use the rbatis crate directly | Zero extra overhead                            |
| Python AI framework + Rust performance        | rbatis-py                     | LangChain/LlamaIndex ecosystem is in Python    |
| Rapid prototyping / data science              | rbatis-py                     | Python glue code is more flexible              |
| Hybrid stack (Python orchestration + Rust core)| rbatis-py                    | Unified ORM layer, reduced cognitive load      |

Python RAG Example

import asyncio
from typing import Optional

from rbatis_py import RBatis, Model

class Document(Model):
    __table__ = "documents"
    id: Optional[int] = None
    title: Optional[str] = None
    content: Optional[str] = None
    embedding: Optional[bytes] = None  # Turso BLOB

async def semantic_search(db: RBatis, query_vec: list[float], limit: int = 3):
    """Turso vector search — raw SQL calling vector_distance_cos"""
    vec_str = ",".join(str(v) for v in query_vec)
    sql = f"""
        SELECT id, title, content,
               vector_distance_cos(embedding, vector32('[{vec_str}]')) AS distance
        FROM documents
        ORDER BY distance
        LIMIT {limit}
    """
    return await db.exec_decode(sql)

async def main():
    db = RBatis()
    await db.link("libsql://your-db.turso.io?token=YOUR_TOKEN")

    # --- Routine CRUD: built-in Model methods ---
    await Document.insert(db, {
        "title": "Rbatis Python Bindings",
        "content": "rbatis-py brings Rbatis performance to Python...",
        "embedding": None
    })

    rows = await Document.select_by_map(db, {"title": "Rbatis Python Bindings"})
    print(rows)

    # --- Vector search: same as Rust — must write raw SQL ---
    fake_vec = [0.1] * 1536
    results = await semantic_search(db, fake_vec)

    for row in results:
        print(f"  [{row['distance']:.4f}] {row['title']}")

    await db.close()

asyncio.run(main())

The core pattern stays the same regardless of language:

  • Routine CRUD โ†’ Model.insert() / Model.select_by_map() (equivalent to Rust's crud! macro)
  • Vector search โ†’ raw SQL calling vector_distance_cos()
  • Transactions โ†’ async with db.begin_defer() for auto-commit/rollback

If you're using Python for AI Agent orchestration but want Rbatis performance at the database layer, rbatis-py is the ideal bridge. Install with pip install rbatis-py. Source: rbatis/rbatis-py on GitHub.

