Taskvisor 0.2: event-driven task supervision for Tokio (restart policies, backoff, event bus)
Rust · 2026-06-11 · surdeus
Hi all!
I just released taskvisor 0.2.1 and this is its first public announcement.
taskvisor is a small library on top of Tokio (no unsafe, no heavy deps) that runs your background tasks, restarts them according to a per-task policy, and publishes a structured event for every lifecycle step.
What you get:
- Restart policies as data: Never, OnFailure, Always { interval } per task;
- Backoff with jitter: exponential / constant; Full / Equal / Decorrelated jitter;
- A lifecycle event bus: implement one trait method, on_event, for metrics, alerts, logging. Each subscriber gets its own bounded queue, so slow subscribers never block the runtime;
- Panics are supervised too: a panicking task is caught, surfaced as a failure event, and retried per policy (it won't take down the process or leak);
- Dynamic management: add / cancel / remove tasks at runtime, addressed by a runtime TaskId;
- Optional admission control (feature: controller): named slots with Queue, Replace, DropIfRunning policies;
- Graceful shutdown with a grace period, then force-abort of stragglers.
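To make the backoff bullet concrete, here is a minimal std-only sketch of exponential backoff with the three jitter flavours named above. It follows the commonly used formulas for Full, Equal and Decorrelated jitter; the `Jitter` enum, `exponential` and `apply_jitter` are hypothetical names for illustration, not taskvisor's actual API, and the random sample is passed in as `unit` so the result is reproducible.

```rust
use std::time::Duration;

/// Jitter strategies, named after the ones in the feature list.
/// Illustrative only: this is NOT taskvisor's own type.
#[derive(Clone, Copy, Debug)]
enum Jitter {
    None,
    Full,
    Equal,
    Decorrelated,
}

/// Exponential delay for a zero-based attempt: base * factor^attempt, capped at `max`.
fn exponential(base: Duration, factor: f64, max: Duration, attempt: u32) -> Duration {
    let secs = base.as_secs_f64() * factor.powi(attempt as i32);
    Duration::from_secs_f64(secs.min(max.as_secs_f64()))
}

/// Apply jitter to an already computed delay.
/// `unit` is a random sample in [0, 1) supplied by the caller;
/// `prev` is the previously used delay (only Decorrelated reads it).
fn apply_jitter(
    jitter: Jitter,
    delay: Duration,
    base: Duration,
    prev: Duration,
    max: Duration,
    unit: f64,
) -> Duration {
    let d = delay.as_secs_f64();
    let secs = match jitter {
        // No randomisation at all.
        Jitter::None => d,
        // Uniform in [0, d).
        Jitter::Full => d * unit,
        // Half fixed, half random: uniform in [d/2, d).
        Jitter::Equal => d / 2.0 + (d / 2.0) * unit,
        // Uniform in [base, 3 * prev), independent of the attempt number.
        Jitter::Decorrelated => {
            let lo = base.as_secs_f64();
            let hi = (prev.as_secs_f64() * 3.0).max(lo);
            lo + (hi - lo) * unit
        }
    };
    Duration::from_secs_f64(secs.min(max.as_secs_f64()))
}

fn main() {
    let base = Duration::from_millis(100);
    let max = Duration::from_secs(10);
    let prev = Duration::from_millis(400);
    let raw = exponential(base, 2.0, max, 3);
    // Fixed sample so the run is reproducible; use a real RNG in practice.
    let unit = 0.5;
    for jitter in [Jitter::None, Jitter::Full, Jitter::Equal, Jitter::Decorrelated] {
        let delay = apply_jitter(jitter, raw, base, prev, max, unit);
        println!("{jitter:?}: {delay:?}");
    }
}
```

Full jitter spreads retries the most, Equal keeps a guaranteed minimum wait, and Decorrelated grows from the previous delay rather than from the attempt counter.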
Example:
A flaky task that fails twice and then succeeds with a subscriber printing what the supervisor does:
use std::sync::Arc;
use std::sync::atomic::{AtomicU32, Ordering};

use taskvisor::prelude::*;

// Subscriber that prints every event that belongs to a task.
struct Printer;

impl Subscribe for Printer {
    fn on_event(&self, ev: &Event) {
        if let Some(task) = ev.task.as_deref() {
            println!(" {:?} (task={task})", ev.kind);
        }
    }
}

#[tokio::main(flavor = "current_thread")]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Shared attempt counter: the first two runs fail, the third succeeds.
    let attempts = Arc::new(AtomicU32::new(0));

    let flaky: TaskRef = TaskFn::arc("flaky", move |_ctx| {
        let attempts = Arc::clone(&attempts);
        async move {
            if attempts.fetch_add(1, Ordering::Relaxed) < 2 {
                Err(TaskError::Fail { reason: "boom".into(), exit_code: None })
            } else {
                Ok(())
            }
        }
    });

    let spec = TaskSpec::restartable(flaky);

    // Run the supervisor with Printer as the only subscriber.
    Supervisor::new(SupervisorConfig::default(), vec![Arc::new(Printer)])
        .run(vec![spec])
        .await?;

    Ok(())
}
TaskAddRequested -> TaskAdded -> TaskStarting -> TaskFailed -> BackoffScheduled -> TaskStarting -> TaskFailed -> BackoffScheduled -> TaskStarting -> TaskStopped -> ActorExhausted -> TaskRemoved
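For readers who want to see the decision logic behind such a trace, here is a rough, synchronous, std-only sketch of a restart loop driven by a policy value. It is not taskvisor's implementation: `RestartPolicy`, `Next`, `decide` and `supervise` are hypothetical names, the event labels are plain strings borrowed from the trace above, and the meaning given to `Always { interval }` (restart after `interval` on success, after the backoff on failure) is an assumption.

```rust
use std::time::Duration;

/// Restart policy as plain data (hypothetical mirror of the names in the post).
#[derive(Clone, Copy, Debug)]
enum RestartPolicy {
    Never,
    OnFailure,
    Always { interval: Duration },
}

/// What the loop does after one run of the task.
#[derive(Debug, PartialEq)]
enum Next {
    Stop,
    RestartAfter(Duration),
}

/// Pure decision function: policy + outcome -> next step.
fn decide(policy: RestartPolicy, result: &Result<(), String>, backoff: Duration) -> Next {
    match (policy, result) {
        (RestartPolicy::Never, _) => Next::Stop,
        (RestartPolicy::OnFailure, Ok(())) => Next::Stop,
        (RestartPolicy::OnFailure, Err(_)) => Next::RestartAfter(backoff),
        // Assumption: a successful run under Always waits `interval`,
        // a failed run waits the backoff delay.
        (RestartPolicy::Always { interval }, Ok(())) => Next::RestartAfter(interval),
        (RestartPolicy::Always { .. }, Err(_)) => Next::RestartAfter(backoff),
    }
}

/// Run `task` until the policy says stop or `max_runs` is reached,
/// returning a log of event-like labels.
fn supervise<F>(
    policy: RestartPolicy,
    backoff: Duration,
    max_runs: u32,
    mut task: F,
) -> Vec<&'static str>
where
    F: FnMut() -> Result<(), String>,
{
    let mut log = Vec::new();
    for _ in 0..max_runs {
        log.push("TaskStarting");
        let result = task();
        log.push(if result.is_ok() { "TaskStopped" } else { "TaskFailed" });
        match decide(policy, &result, backoff) {
            Next::Stop => break,
            Next::RestartAfter(delay) => {
                log.push("BackoffScheduled");
                std::thread::sleep(delay);
            }
        }
    }
    log
}

fn main() {
    // Same shape as the flaky task above: fail twice, then succeed.
    let mut attempts = 0;
    let log = supervise(RestartPolicy::OnFailure, Duration::from_millis(10), 10, || {
        attempts += 1;
        if attempts <= 2 { Err("boom".to_string()) } else { Ok(()) }
    });
    println!("{}", log.join(" -> "));

    // The other policies, exercised through the pure decision function.
    let every_5s = RestartPolicy::Always { interval: Duration::from_secs(5) };
    println!("{:?}", decide(RestartPolicy::Never, &Err("x".to_string()), Duration::ZERO));
    println!("{:?}", decide(every_5s, &Ok(()), Duration::ZERO));
}
```

Under an OnFailure-style policy the log reproduces the middle of the trace above, from the first TaskStarting to TaskStopped. Keeping `decide` a pure function over data is what makes "restart policies as data" easy to test in isolation.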
What it is not:
- not an actor framework
- not a job queue
- not a tower replacement
Where it fits:
taskvisor is for long-running services that own a set of resident background tasks: the things that must be running the whole time the process is up, such as
- queue consumers;
- pollers;
- sync loops;
- connection keepers;
- periodic jobs;
- embedded workers.
If a task dies, you want it restarted with backoff;
if it misbehaves, you want to see it (metrics, alerts);
if the set changes at runtime, you want to add and remove tasks without restarting the service.
That's the niche.
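On the "you want to see it" part: the feature list says each subscriber has its own bounded queue so a slow one never blocks the runtime. A minimal std-only sketch of that idea is below. It assumes a drop-on-overflow strategy, which the post does not confirm, and `Bus`, `subscribe`, `publish` and this `Event` struct are hypothetical stand-ins rather than taskvisor's types.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Stand-in event type for the sketch (not taskvisor's `Event`).
#[derive(Clone, Debug, PartialEq)]
struct Event {
    kind: &'static str,
}

/// Fan-out bus: one bounded channel per subscriber.
struct Bus {
    subscribers: Vec<SyncSender<Event>>,
    /// Events that could not be delivered (queue full or receiver gone).
    dropped: usize,
}

impl Bus {
    fn new() -> Self {
        Bus { subscribers: Vec::new(), dropped: 0 }
    }

    /// Register a subscriber with its own queue; `capacity` should be at least 1.
    fn subscribe(&mut self, capacity: usize) -> Receiver<Event> {
        let (tx, rx) = sync_channel(capacity);
        self.subscribers.push(tx);
        rx
    }

    /// Deliver to every subscriber without ever blocking the publisher.
    /// Assumption: when a queue is full, the event is dropped for that subscriber only.
    fn publish(&mut self, ev: &Event) {
        for tx in &self.subscribers {
            match tx.try_send(ev.clone()) {
                Ok(()) => {}
                Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => {
                    self.dropped += 1;
                }
            }
        }
    }
}

fn main() {
    let mut bus = Bus::new();
    let fast = bus.subscribe(8); // roomy queue
    let slow = bus.subscribe(2); // tiny queue that is never drained in time

    for _ in 0..5 {
        bus.publish(&Event { kind: "TaskStarting" });
    }

    let fast_events: Vec<Event> = fast.try_iter().collect();
    let slow_events: Vec<Event> = slow.try_iter().collect();
    println!(
        "fast got {} x {:?}, slow got {}, dropped {}",
        fast_events.len(),
        fast_events[0].kind,
        slow_events.len(),
        bus.dropped
    );
}
```

The point of the per-subscriber queue is isolation: the slow subscriber loses events once its own queue fills up, while the fast one and the publisher are unaffected.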
Links:
crates.io · docs.rs · github · examples
193 tests, #![forbid(unsafe_code)], MSRV 1.90.
taskvisor is the supervision core of a larger toolkit I'm building (subprocess execution, HTTP/gRPC control plane), but it stands on its own.
I'd especially appreciate feedback on the SupervisorHandle API (TaskId-based addressing) and the slot/admission-control design.
Thanks!