Tokio's `block_in_place` and deferred wakers
⚓ Rust 📅 2026-02-02 👤 surdeus 👁️ 8
Hi, everyone.
So, in the project I'm working on there is a bunch of long-running "subsystem" tasks, each of which looks like this (pseudo-code):
    let subsystem_object = ...;
    loop {
        select! {
            biased;
            result = (&mut shutdown_rx) => {
                // shutdown_rx is a one-shot receiver indicating
                // that the app is being shut down.
                break;
            }
            Some(action) = action_rx.recv() => {
                // "action_rx" is mpsc::UnboundedReceiver.
                // "action" contains a closure (returning a future) to run on
                // subsystem_object and a one-shot sender to send the result
                // back to the "caller".
                // Run the closure on subsystem_object, await the returned
                // future, and send the result through the one-shot sender.
            }
        }
    }
Normally a "subsystem object" is async, but in two subsystems, A and B, it is sync.
A's object needs to call B, so it sends an action to B's channel as usual and then waits for the result via `block_in_place`.
I know this is not pretty, but IMO it should work: the blocking call is only made in one direction, from A to B, and at most one such call is in flight at any time.
However, in one particular scenario (a slow machine under heavy load, where other subsystems bombard B with actions to perform), B stalls completely and stays stalled until shutdown is initiated.
When it happens I can see the following:
- B is not coping well with the load, e.g. its `action_rx` usually contains over a dozen unprocessed actions.
- Right before calling `block_in_place`, A switches to the same thread where B is running.
- After A calls `block_in_place`, B is never polled again, until shutdown.
After browsing the Tokio (1.49) source code for a while, I think the following might be happening under the hood:
- B's budget is exhausted at some point, so when it's polled, `select!`'s future returns `Pending` and the waker is deferred.
- When A calls `block_in_place`, the current "worker core" gets stolen and put on another thread via `spawn_blocking`. But the deferred waker stays on the current thread and it's only triggered once the blocking call completes.
- Since B's action channel already had a bunch of items in it when it was last checked, it didn't register a waker, so sending new items to the channel won't wake it up.
- A's call to `block_in_place` can only finish when B handles the action, but B won't be informed of new actions until `block_in_place` finishes, because A and B are both on the same thread.
The stalling goes away if I wrap the whole `select!` in a call to `task::unconstrained`.
So my questions are:
- First of all, is my understanding of the Tokio machinery correct?
- If so, can the described behavior be considered a bug that should be reported? (E.g. perhaps the deferred waker should be triggered before doing the blocking call.)
- Is there a better workaround than using `unconstrained`?
I mean, I understand that the correct solution is to at least make A async and use `spawn_blocking` instead of `block_in_place` (or ideally make them both async and get rid of the blocking calls altogether), but this will be a major refactoring.
So I'm looking for an "easy" workaround at this moment.