Take a look at the neural network framework I wrote, which is implemented in Rust + CUDA
⚓ Rust 📅 2026-05-01 👤 surdeus 👁️ 1

Highlights
- Rust-first, not Rust-only implementation
  - Rust owns the framework structure and most high-level logic.
  - CUDA C++ is used for optional GPU acceleration.
  - CPU-only builds remain available without the `cuda` feature.
- Dynamic autograd built around tensor graph construction
- Module-style abstraction for model components
- Separated layers / ops / models for easier experimentation
- Flexible precision system
  - parameter dtype
  - runtime dtype
  - activation dtype
  - KV-cache dtype
- Quantization-aware loading
  - load float weights normally
  - quantize on load to `i8`
  - generate offline quantized safetensors
- CPU and CUDA execution paths with explicit kernel/backend work
- Hugging Face `tokenizers` integration
- Safetensors support with memory-mapped and streamed loading modes
- Release profile tuned with `lto`, `panic = "abort"`, and `strip`
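
For readers wondering what "quantize on load to `i8`" can look like in practice, here is a minimal sketch. It assumes symmetric per-tensor quantization with a single `f32` scale, which is only one common scheme; the post does not say which scheme the framework actually uses. The names `QuantizedI8`, `quantize_i8`, and `dequantize_i8` are hypothetical and are not taken from the framework.

```rust
/// Hypothetical container for an i8-quantized tensor (not the framework's type).
#[derive(Debug, Clone, PartialEq)]
pub struct QuantizedI8 {
    /// Quantized values in the range [-127, 127].
    pub values: Vec<i8>,
    /// Scale such that `original ≈ value as f32 * scale`.
    pub scale: f32,
}

/// Symmetric per-tensor quantization (an assumed scheme): scale = max|w| / 127.
pub fn quantize_i8(weights: &[f32]) -> QuantizedI8 {
    // Largest absolute value in the tensor; 0.0 for an empty slice.
    let max_abs = weights.iter().fold(0.0_f32, |m, w| m.max(w.abs()));
    // An all-zero or empty tensor would give scale 0; fall back to 1.0
    // so the division below is always well defined.
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    let values = weights
        .iter()
        .map(|w| (w / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    QuantizedI8 { values, scale }
}

/// Map quantized values back to approximate f32 weights.
pub fn dequantize_i8(q: &QuantizedI8) -> Vec<f32> {
    q.values.iter().map(|&v| f32::from(v) * q.scale).collect()
}
```

With this scheme, calling `quantize_i8` on each float tensor as it is read would store one byte per weight plus one scale per tensor, and the round-trip error per weight stays within about half of `scale`. Per-channel or block-wise scales are common refinements, but whether the framework uses them is not stated in the post.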