State of `async/await`: unrestrained cooperation is not cooperative

⚓ Rust    📅 2025-06-25    👤 surdeus    👁️ 6


Disclaimer: a deliberate use of strongly worded language is coming next. No offense implied or intended. Nor is there any implicit attempt to devalue the work of the countless people who led up to this point. So as to make this discussion as constructive as possible: for every issue / opinion / argument presented, the LCD solution will be provided. Reader's discretion advised.


I've been looking into Rust's approach to async/await for a while now. The deeper into the weeds I get, the stronger my "something's clearly wrong here" sense grows: from the lack of any comprehensive end-to-end resource on the matter to the mountain of edge cases to consider.

The following is an attempt to piece together the current state of affairs - as of June 2025, one; highlight some of the most glaring (IMPO) shortcomings of the current design and implementation, two; as well as to brainstorm the most sensible / efficient / productive way forward - from now on.

In order from the least to the most significant:

[1]

On the documentation side, the std docs for both the async and the await keyword say:

We have written an async book detailing async/await and trade-offs compared to using threads.

Yet the actual book in question immediately contradicts the "written" part:

NOTE: this guide is currently undergoing a rewrite after a long time without much work. It is work in progress, much is missing, and what exists is a bit rough.

For me, this lands somewhere between "a bit surprising" and "outright embarrassing". Why does it say "written" when it's clearly "unfinished"? Is the documentation wrong/out-of-date? Is there some fully "written" version of the book elsewhere that the new "undergoing" rewrite fails to mention? What should be expected of a newcomer who stumbles upon this for the first time - if not confusion?

Solution/s (click for more details)

[2]

Fragmentation of the ecosystem. I'm not talking about the choice between tokio or smol or async_std or async-global-executor or futures-executor or futures_lite::future::block_on. I'm talking about the absolutely gargantuan amount of careful plug-A-from-X with use-B-from-Y, all while making sure neither X nor Y uses the set of utils or wrappers or adapters from Z, which you might still need D and E from; all the while making sure you don't use any of the F or G or H from it, since they were explicitly reimplemented in X or Y altogether and are no longer compatible.

Example: opening a file, reading it line by line, enumerating each one in the process. [1]

std (click for more details) tokio (click for more details)
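
For reference, the blocking std side of that example fits in a handful of lines. What follows is a minimal sketch, not the snippet from the original post; the function name `numbered_lines` and the 1-based numbering are assumptions made here for illustration.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use std::path::Path;

/// Reads `path` line by line and pairs each line with its 1-based number.
/// (`numbered_lines` is a hypothetical helper, named here for illustration.)
fn numbered_lines(path: &Path) -> io::Result<Vec<(usize, String)>> {
    let reader = BufReader::new(File::open(path)?);
    let mut out = Vec::new();
    // `lines()` yields `io::Result<String>`; `enumerate()` counts from 0.
    for (index, line) in reader.lines().enumerate() {
        out.push((index + 1, line?));
    }
    Ok(out)
}
```

The point of comparison: `lines()` here is a plain `Iterator`, so `enumerate()` comes for free. As far as I know, the tokio counterpart (`tokio::io::AsyncBufReadExt::lines`) hands back a `Lines` value that is driven with `next_line().await` and is not an iterator or a stream by itself, so the enumeration has to be done by hand or through yet another adapter crate.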

This isn't about tokio alone. smol has their futures_lite, which reinvents the AsyncBufReadExt wheel yet again. async_std has its own set. tokio::pin! is not the same as std::pin::pin!, while its tokio::join! seems identical to futures::join! with no parallel for futures::join_all at all.

Poll-based async/await is sufficiently hard as it is. There are more than enough variables to keep track of: given the point [3] to follow especially. Complicating things even further by duplicating / reinventing / rehashing the same few methods across a dozen different crates makes no sense.

To be perfectly clear: this isn't about opting in/out of nightly or unstable channels with its AsyncIterator and/or core::stream::Stream; but minimizing the amount of friction and cognitive load people must subject themselves to. Both newcomers and people versed in the sync side only.

Solution/s (click for more details)

[3]

My memory might be playing a few tricks on me at this point, yet for some reason I still remember rather well a handful of comments with regards to the way this language handled assumptions. Especially the assumptions regarding the ability of the developer behind it to do the right thing.

One phrase in particular stuck out more than usual. It was -

The Pit of Success: in stark contrast to a summit, a peak, or a journey across a desert to find victory through many trials and surprises, we want our customers to simply fall into winning practices by using our platform and frameworks. To the extent that we make it easy to get into trouble we fail. - Falling Into The Pit of Success

From the borrow checker to the exclusive/mutable vs shared/read-only &'s to the Sync and Send markers: everything seemed to have been built around the same few core tenets.

  1. people are not that smart: no matter how strongly they might feel to the contrary
  2. they will make mistakes: regardless of the extent of their knowledge and experience
  3. it is not their fault alone: even the best craftsman can only do so much with a horrible tool

Which is a perfectly reasonable set of assumptions to hold.

Unless we're talking about async/await.

Suddenly: you're hereby required to be [1] knowledgeable enough to [2] avoid all the mistakes you can possibly make while porting any of the blocking code you might have had in mind into the realm of asynchronous execution; and should you fail a task so trivial - it is definitely [3] your own fault.

If you fail at any of the three, things get even more interesting. Your code will compile perfectly fine. It will run perfectly fine. Some of the time, at least. Until a section of your code that never once blocked during a #[test] run gets busy processing some abnormally large chunk of data.

Suddenly: things just freeze. Until they don't. Until they do again. Reproducible? Some of the time. Unexpected? Definitely. Infuriating? Always. If only you were [1] a tiny bit smarter you would have realized that it is absolutely critical for you to [2] never leave any section of code, no matter how seemingly transitory at a glance, to chance with regards to its ability to block on a given task / worker / thread. Unfortunately, [3] you made a mistake. async/await was never to blame.

Or was it?

  • why is the underlying impl Future in no way restrained by default?
  • how come the cancellation safety is entirely optional?
  • what is the async alternative to the std::thread::yield_now() call?
    tokio::task::yield_now() only adds an .await point to an existing async block;
    assuming the need to yield from an arbitrary execution point within - what's the way?
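
To make the freeze and the yield question concrete, here is a minimal std-only sketch. It assumes a toy single-threaded round-robin executor (`run_all`) and a hand-rolled yield point (`YieldNow`); every name in it is made up for illustration, and it shows the general idea of a yield point rather than tokio's actual implementation.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical yield point: `Pending` exactly once, then `Ready`.
struct YieldNow(bool);

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref(); // ask to be polled again
        Poll::Pending
    }
}

/// Waker that does nothing: the toy executor below re-polls everything anyway.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

type Task = Pin<Box<dyn Future<Output = ()>>>;
type Log = Rc<RefCell<Vec<String>>>;

/// Toy executor: polls tasks round-robin on one thread until all finish.
/// Nothing here can interrupt a task between two `.await` points.
fn run_all(tasks: Vec<Task>) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut queue: VecDeque<Task> = tasks.into();
    while let Some(mut task) = queue.pop_front() {
        if task.as_mut().poll(&mut cx).is_pending() {
            queue.push_back(task);
        }
    }
}

/// Logs `steps` entries; only gives up the thread when `cooperative` is set.
async fn worker(name: &'static str, steps: u32, cooperative: bool, log: Log) {
    for step in 0..steps {
        log.borrow_mut().push(format!("{name}{step}"));
        if cooperative {
            YieldNow(false).await;
        }
    }
}

/// Runs two workers side by side and returns the order in which they logged.
fn demo(cooperative: bool) -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let mut tasks: Vec<Task> = Vec::new();
    tasks.push(Box::pin(worker("a", 3, cooperative, log.clone())));
    tasks.push(Box::pin(worker("b", 3, cooperative, log.clone())));
    run_all(tasks);
    let order = log.borrow().clone();
    order
}
```

With `demo(false)` the log comes out as a0, a1, a2, b0, b1, b2: task `a` holds the thread until it is done. With `demo(true)` it is a0, b0, a1, b1, a2, b2. Note that the yield only works because `worker` is itself an `async fn`; a plain `fn` called from inside it has no way to reach `YieldNow`, which is exactly the gap the last question above is about.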

Without any semblance of an enforced restraint or a preemptive capability of the underlying executor: there can't be an async "pit of success". It is far too easy to skip over a while / loop / for; to forget to use a non-blocking alternative to an otherwise perfectly valid executable section; to lose track of the number of CPU cycles until the next .await point within each and every Task.

Expecting people to do all of the above and more is not much different from expecting them to keep track of each and every raw pointer to each and every heap allocation across each and every thread they are ever going to spawn and interact with. We know how "well" it works in C/C++.

Solution/s? Implicit configurable restraint (in ops/cycles) for each impl Future might be a good start. Mirroring the implicit #[repr(Rust)] on any enum / union / struct declaration:

// (1) `restrained` by default = auto `.await` every X ops
#[restrained|restrained(ops: 50)|unrestrained|] 
async {
    async_fn_1().await;
    // (2) suspend in place or revert 
    // to the last `Poll::Pending` point
    std::task::yield_now(); 
    // (3) are we talking fibers at this point?
    sync_call_async_spawn();
}

fn sync_call_async_spawn() {
    let mut str = String::new();
    let task_current = std::task::current();
    let str_task_local = std::task::spawn_local(async {
        async_fn_3(&mut str).await;
        task_current.unpark();
    });
    // suspend + yield_now
    std::task::park(); 
}
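
Until something like that exists, the closest user-land approximation I can think of is an explicit budget that has to be charged by hand. A minimal sketch under stated assumptions: `Budget`, `tick`, `sum_with_budget` and `block_on_counting` are all hypothetical names, nothing here is a std or tokio API, and unlike the proposal above it is entirely opt-in.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical yield point: `Pending` exactly once, then `Ready`.
struct YieldNow(bool);

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Hypothetical stand-in for `restrained(ops: N)`: yields every `limit` ticks.
struct Budget {
    limit: u32,
    spent: u32,
}

impl Budget {
    fn new(limit: u32) -> Self {
        Budget { limit, spent: 0 }
    }

    /// Charges one "op"; hands control back once the budget is used up.
    async fn tick(&mut self) {
        self.spent += 1;
        if self.spent >= self.limit {
            self.spent = 0;
            YieldNow(false).await;
        }
    }
}

/// A CPU-bound loop that would otherwise never reach an `.await` point.
async fn sum_with_budget(data: &[u64], ops_per_slice: u32) -> u64 {
    let mut budget = Budget::new(ops_per_slice);
    let mut total = 0;
    for value in data {
        total += value;
        budget.tick().await;
    }
    total
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Drives one future to completion and counts how often it yielded.
/// Busy-polls, which is acceptable here only because `YieldNow` wakes itself.
fn block_on_counting<F: Future>(future: F) -> (F::Output, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    let mut yields = 0;
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return (output, yields),
            Poll::Pending => yields += 1,
        }
    }
}
```

Summing 200 values with a budget of 50 yields four times before finishing. The catch is the one this whole point is about: every loop has to remember to call `tick().await`, so the restraint is only as good as the discipline of whoever writes the loop.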

Alternative option: a whole bunch of lints all over the crate with clippy or similar. The linter would have to scan through the entire codebase and separate potentially (costly) blocking sections from the rest of the async code. Twisting people's arms into inspecting all of their projects all over again just to stop a linter from screaming at them doesn't feel like the most sensible solution out there, however.

Bring your own suggestions and post them up/down below. I'm not attached to any particular spectrum of solutions: only (ever so) mildly dissatisfied with the current status quo.


  1. What's even more amusing here is that I can clearly remember having the exact same issue a few years ago. Back when I had no clue about or interest in why std::pin::Pin<&mut Self> was the receiving argument of poll; when reading through the definition of the Future trait gave me a headache; and when trying to implement the trait from scratch seemed just insane. I think I never quite managed to get that exact combination going, too. ↩︎
