MetaXuda: Metal GPU runtime for ML on Apple Silicon (1.1 TOPS with Tokio async)


surdeus

Hey Rustaceans! 👋

I built MetaXuda - a native GPU runtime for machine learning on Apple Silicon, entirely in Rust.

Motivation:
Got tired of the "buy Windows for ML" advice. Most ML libraries are CUDA-only with zero macOS GPU support, and translation layers like ZLUDA add overhead, so I built a runtime from scratch on Metal.

Tech Stack:

  • Rust core with Tokio async runtime
  • Metal for GPU acceleration
  • PyO3 for Python bindings (cuda_pipeline.so; a binding sketch follows this list)
  • Arrow-based in-kernel quantization
  • Multi-tier memory manager (GPU → RAM → SSD)
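
Quick illustration of the binding layer: a minimal PyO3 sketch of how an op could be exposed and compiled into the cuda_pipeline module. The `vector_add` function and its CPU-only body are invented for the example, not MetaXuda's actual API; the real ops dispatch to Metal.

```rust
use pyo3::prelude::*;

// Hypothetical op, CPU-only for the sketch; real ops dispatch to Metal.
#[pyfunction]
fn vector_add(a: Vec<f32>, b: Vec<f32>) -> Vec<f32> {
    a.iter().zip(&b).map(|(x, y)| x + y).collect()
}

// The module name matches the compiled artifact, cuda_pipeline.so.
#[pymodule]
fn cuda_pipeline(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(vector_add, m)?)
}
```

From Python this imports as `import cuda_pipeline`.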

Performance:

  • 1.1 TOPS throughput (95% of M3 Max theoretical peak)
  • 230+ GPU operations (math, transform, ML primitives)
  • 93.37% GPU utilization cap, so macOS's own GPU work is never starved (one capping pattern is sketched after this list)
  • Zero race conditions via centralized scheduler
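
On the utilization cap: I can't paste the real implementation here, but the idea is "bound the amount of in-flight GPU work so macOS always keeps a slice." A minimal sketch with a Tokio semaphore; the slot counts and names are made up for the example:

```rust
use std::sync::Arc;
use tokio::sync::Semaphore;

#[tokio::main]
async fn main() {
    // Illustrative budget: 15 of 16 slots (~94%) left for our work,
    // one slot always free so the OS compositor is never starved.
    let gpu_slots = Arc::new(Semaphore::new(15));
    let mut tasks = Vec::new();
    for op in 0..64 {
        let slots = gpu_slots.clone();
        tasks.push(tokio::spawn(async move {
            let _permit = slots.acquire().await.unwrap(); // held until drop
            // encode + commit the Metal command buffer here ...
            println!("op {op} running");
        }));
    }
    for t in tasks {
        t.await.unwrap();
    }
}
```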

Architecture Highlights:

  • Migrated from sync → async (40+ iterations to get it right!)
  • Stream managers + thread-pool groups coordinated by a central scheduler (pattern sketched after this list)
  • Handles 100GB+ workloads through intelligent memory tiering
  • CUDA-compatible API naming for library interop
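
The race-freedom claim comes from the scheduler being the sole owner of all mutable scheduling state; stream managers only talk to it over channels. A stripped-down sketch of that pattern (message types and names are illustrative, not MetaXuda's actual API):

```rust
use tokio::sync::{mpsc, oneshot};

// Illustrative message type: workers submit ops and await completion.
enum SchedMsg {
    Submit { op: String, done: oneshot::Sender<()> },
}

// The scheduler task solely owns its state: no locks, no data races.
async fn scheduler(mut rx: mpsc::Receiver<SchedMsg>) {
    let mut dispatched: u64 = 0; // never shared, never locked
    while let Some(SchedMsg::Submit { op, done }) = rx.recv().await {
        dispatched += 1;
        // encode the op onto a Metal stream here ...
        println!("op #{dispatched}: {op}");
        let _ = done.send(());
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::channel(64);
    let handle = tokio::spawn(scheduler(rx));

    let (done_tx, done_rx) = oneshot::channel();
    tx.send(SchedMsg::Submit { op: "gemm".into(), done: done_tx })
        .await
        .unwrap();
    done_rx.await.unwrap(); // completion signaled by the scheduler

    drop(tx); // closing the channel lets the scheduler task exit
    handle.await.unwrap();
}
```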

Current Status:

  • Works with Numba (bypassing its execution path)
  • Install with `pip install metaxuda`
  • Toolkit integration (scikit-learn, XGBoost) coming next
  • CUDA API coverage still in progress

Known Challenges:

  • Apple's Metal stream limits are undocumented (I reverse-engineered what I could)
  • Some intentional blocking favors stability over raw speed
  • Roughly 1-in-a-million scheduler notification misses (rare edge case; the failure shape is sketched below)
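
I won't claim this is exactly my bug, but the classic shape of it with tokio::sync::Notify is: notify_waiters() stores no permit, so a notification fired between checking the work queue and registering notified() is lost, and back-to-back notify_one() calls coalesce into a single stored permit. The usual mitigation is to bound the wait and re-check the real condition:

```rust
use std::sync::Arc;
use tokio::sync::Notify;
use tokio::time::{timeout, Duration};

// Sketch of a loss-tolerant wait: a missed notification costs a few
// milliseconds of latency instead of hanging the scheduler forever.
async fn wait_for_work(notify: Arc<Notify>, has_work: impl Fn() -> bool) {
    loop {
        if has_work() {
            return; // re-check the condition, never trust the wakeup alone
        }
        // Bounded wait: wakes on notify_one() or the timeout, whichever is first.
        let _ = timeout(Duration::from_millis(5), notify.notified()).await;
    }
}
```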

Looking for feedback on:

  • Async scheduler design patterns (Tokio + Metal coordination)
  • Memory tier eviction strategies (a strawman LRU sketch follows this list)
  • Anyone hitting Apple GPU quirks I should know about?
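
For concreteness on the eviction question, here's the strawman I'd compare alternatives against: LRU demotion one tier down when the upper tier fills. Everything here (types, budgets, fields) is invented for the sketch, not MetaXuda's internals:

```rust
use std::collections::VecDeque;

#[derive(Clone, Copy, PartialEq)]
enum Tier { Gpu, Ram, Ssd } // Ram -> Ssd demotion works the same way

struct Buffer { bytes: usize, tier: Tier }

struct TierManager {
    lru: VecDeque<Buffer>, // front = hottest, back = eviction candidate
    gpu_used: usize,
    gpu_budget: usize,
}

impl TierManager {
    // Demote least-recently-used GPU buffers until `needed` bytes fit.
    fn ensure_gpu_space(&mut self, needed: usize) {
        while self.gpu_used + needed > self.gpu_budget {
            match self.lru.iter_mut().rev().find(|b| b.tier == Tier::Gpu) {
                Some(victim) => {
                    victim.tier = Tier::Ram; // real code would copy the data out
                    self.gpu_used -= victim.bytes;
                }
                None => break, // nothing left on the GPU to demote
            }
        }
    }
}
```

Curious whether anyone has found cost-aware policies (transfer size vs. reuse likelihood) worth the extra complexity over plain LRU.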

License inquiries: p.perinban@gmail.com

Would love thoughts from the community, especially on the Rust/async architecture choices!
