Announcing FluxBench 0.1.0: A Crash-Resilient Benchmarking Framework with Native CI Support
Rust · 2026-02-13 · surdeus

I'm really happy to share the first release of FluxBench, a benchmarking framework built to make performance testing as reliable and automated as your unit tests.
Why I built this
I love improving my code's performance, but I found that as my projects grew, maintaining benchmarks became a project in itself. I often spent time writing boilerplate just to extract inputs, and if a single benchmark hit an edge case and panicked, it would crash the entire suite.
I wanted a workflow where I could focus entirely on my library logic, while the tooling handled the stability and regression checking. I wanted to be able to say "ensure this version isn't slower than the last one" directly in the code, rather than manually comparing numbers in logs.
That desire for a smoother, more "set-and-forget" experience led to FluxBench.
What makes it different?
- It keeps running: FluxBench uses a Supervisor-Worker architecture. Every benchmark runs in its own process. If one panics, the supervisor catches it, records the error, and moves on to the next one. Your suite always finishes, giving you the data you need to fix the issue. (A minimal sketch of this pattern follows the list.)
- Performance as Code: You can write logic to verify your performance directly in the benchmark:

  ```rust
  #[verify(expr = "new_impl < old_impl", severity = "critical")]
  struct RegressionCheck;
  ```

- Less Duplication, Cleaner Functions: While benchmarking always requires some setup, I wanted to drastically cut down on the repetitive boilerplate. The macro handles the grouping and wiring, so your function stays focused on exactly what you want to measure:

  ```rust
  #[bench(group = "parsing")]
  fn my_algo(b: &mut Bencher) {
      b.iter(|| expensive_operation());
  }
  ```

- Automated CI checks: Instead of checking for regressions manually, FluxBench can save a baseline (like from your `main` branch) and automatically compare your PRs against it. It calculates the probability of regression and can even generate a GitHub Actions summary for you.
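To make the isolation model concrete, here is a minimal, self-contained sketch of the general supervisor-worker pattern. This is not FluxBench's actual implementation; the benchmark names and the `--run-single` flag are purely illustrative. The point is simply that a panic in a child process cannot take down the loop driving the suite.

```rust
use std::process::Command;

fn main() {
    // Hypothetical benchmark list; FluxBench discovers these through its macros.
    let benchmarks = ["batch_transform", "config_parse", "request_handler"];

    for name in &benchmarks {
        // Re-invoke the current executable and ask it to run a single benchmark.
        // A panic in the child only kills that child process, not this supervisor.
        let status = Command::new(std::env::current_exe().expect("current exe"))
            .arg("--run-single")
            .arg(name)
            .status();

        match status {
            Ok(s) if s.success() => println!("{name}: finished"),
            Ok(s) => println!("{name}: crashed ({s}), continuing with the next benchmark"),
            Err(e) => println!("{name}: failed to spawn worker: {e}"),
        }
    }
}
```

The real framework additionally records the failure, so the final report tells you which benchmark crashed and why.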
See it in action
You can see it running in our own CI pipeline, where it compares pull requests against the main branch baseline (see the example summary below).
P.S.: The benchmark uses sample functions and numbers, since this crate doesn't really have anything worth benchmarking yet.
Example (from the GitHub summary):
| Benchmark | HEAD | main | Change | Status |
|---|---|---|---|---|
| batch_transform | 27 ns | 27 ns | stable | stable |
| config_parse | 2.3 us | 2.3 us | stable | stable |
| request_handler | 83 ns | 72 ns | +15.3% | REGRESSION (>5%) |
| token_scan | 10.3 us | 10.3 us | stable | stable |
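For reference, the Change column is just the relative difference between the HEAD and main measurements, and the Status column flags anything above the 5% threshold. Here is a quick sketch of that arithmetic using the request_handler row (a simplification: as noted above, FluxBench actually computes a probability of regression):

```rust
/// Percent change of the HEAD measurement relative to the baseline.
fn percent_change(head_ns: f64, baseline_ns: f64) -> f64 {
    (head_ns - baseline_ns) / baseline_ns * 100.0
}

fn main() {
    // Numbers from the request_handler row above.
    let change = percent_change(83.0, 72.0);
    let threshold = 5.0; // the >5% regression threshold shown in the Status column

    // Prints "+15.3% -> REGRESSION"
    println!(
        "{:+.1}% -> {}",
        change,
        if change > threshold { "REGRESSION" } else { "stable" }
    );
}
```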
Getting Started
If you are interested in trying it out, the repository has a dedicated examples/ crate with real-world patterns you can copy-paste:
- library_bench.rs: A great starting point showing how to group benchmarks (e.g., separating "parsing" from "logic") and compare different implementations side by side.
- ci_regression.rs: Demonstrates how to set up critical thresholds to automatically fail a build if performance drops (a rough sketch follows this list).
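As a rough idea of what those examples cover, here is a sketch that combines the two attributes shown earlier: two implementations in the same group, plus a critical check that fails the build if the new one is slower. The import path and the parser functions here are placeholders; the files in examples/ show the real patterns.

```rust
use fluxbench::{bench, verify, Bencher}; // assumed import path; see the examples/ crate

// Placeholder implementations standing in for the code under test.
fn fast_parser(s: &str) -> usize { s.len() }
fn slow_parser(s: &str) -> usize { s.chars().count() }

// Two implementations benchmarked side by side in the same group.
#[bench(group = "parsing")]
fn parse_fast(b: &mut Bencher) {
    b.iter(|| fast_parser("key=value"));
}

#[bench(group = "parsing")]
fn parse_slow(b: &mut Bencher) {
    b.iter(|| slow_parser("key=value"));
}

// Fail the build if the new implementation is not faster than the old one.
#[verify(expr = "parse_fast < parse_slow", severity = "critical")]
struct ParserRegression;
```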
I hope this tool helps save you some time and makes your optimization loops a little more enjoyable. I'd love to hear your thoughts or feedback!