Do We Really Need Async?


Maybe this is just a grumpy old geek's take while I'm sipping my espresso (after all, I'm Italian, and some habits are hard to break, or maybe I simply don't want to break them) and reminiscing about manual memory management.

It was a dark and stormy night.

[image: Snoopy typing "It was a dark and stormy night"]

Oh, wait... That is another story.

To be honest, it all started when I was doing a side project for a low-latency data pipeline. I was tired of adding all this async code, plus dozens of dependencies, so I had the "brilliant" idea to apply some concepts I had been cooking in my head after reading some interesting blog posts (for instance, Async/Await Is Real And Can Hurt You and tasks are the wrong abstraction):

what if I removed all the async code and just used plain old threads instead? Wouldn't it be easier and lighter?

Long story short: No, not easier, but definitely lighter (arguably faster) and easier to debug/test. Now, you can skip all the rest if you are not interested in some rant about Rust.

Oh, good. You're still here. I'm not even asking why, but let's move on.

At the foundation of all this madness is one simple idea: the reason for (enjoying) using Rust is the concept of fearless concurrency. And yet, over the past few years, async/await has exploded everywhere, even in the tiniest programs that could have been solved with a simple thread or two. Is async the cure-all it's cracked up to be, or have we just fallen for the hype?

Async: The Siren Song of "Scalability"

Async/await arrived like a flashy superhero promising to save us from "blocking I/O hell." And sometimes it delivers:

- handling thousands of mostly idle connections without a thread apiece,
- overlapping slow I/O without hand-rolled event-loop bookkeeping,
- squeezing concurrency out of environments where threads are scarce.

But then we got drunk on the Kool-Aid...

[image: the Async Hype Train]

Let's hop on the async train. So easy!

Add async here, throw in a .await there, and – voila! – your code is now asynchronous. Until, of course, you realise that once you mark one function async, every caller up the chain also needs to be async.

async is "invasive".

The path of least resistance quickly becomes "make your entire codebase async", not just the parts that need to be. This so‑called "function colouring" means no library function is safe: you need async versions of almost everything. Suddenly, the standard library isn't enough, and every tiny task requires a special async crate. Fun, right?
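To make the contagion concrete, here is a minimal sketch (the function names are mine, purely illustrative):

// One async leaf function...
async fn fetch_user(id: u64) -> String {
    format!("user-{id}")
}

// ...colours every caller up the chain:
async fn handle_request(id: u64) -> String {
    // `.await` is only legal inside an `async fn`,
    // so `handle_request` must become async too.
    fetch_user(id).await
}

fn main() {
    // And a plain fn can't await at all; you need a runtime, e.g.:
    // handle_request(1).await; // ERROR: `await` outside of async context
    // tokio::runtime::Runtime::new().unwrap().block_on(handle_request(1));
}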

The Dependency Black Hole

Here's where things get funny.

Want to write async Rust? Better buckle up for a dependency ride that would make a JavaScript developer blush; and this is where I drew the line (yep, you got it: my lower bound is always JavaScript; I can't go lower!).

The async ecosystem has created what I like to call the "async dependency fractal" (sorry, my inner failed mathematician is always poking around). Every async crate depends on other async crates, which rely on more async crates, ad infinitum.

Once you decide to use async, you inevitably choose a runtime (usually Tokio) and drag in a ton of crates. Corrode Rust points out that Tokio is a deep-seated requirement for many libraries: "one of your dependencies integrates Tokio, effectively nudging you towards its adoption." In fact, as of this writing, more than 27k crates on crates.io use Tokio.

In other words, by choosing async, you hitch a ride on the Tokio Bus, whether you meant to or not. Once you're on the bus, you either stick with Tokio or end up having to "rewrite it in another runtime" when you find a library leaking Tokio-specific types. Maybe this was the correct meaning of "Rewrite all in Rust". Not what I envisioned or wanted.

Consider also the absurdity of this executor-coupling approach: we've created a system where documentation and examples for one async runtime simply don't work with others! The result is what the community euphemistically calls the "One True Runtime": Tokio, which has achieved dominance and become the de facto standard for async in Rust.

The Split Ecosystem Problem

If the async runtime war wasn't enough, the most insidious issue with async Rust is that it has effectively created two distinct languages. There's "Sync Rust" and "Async Rust," and they don't play nicely together.

Want to use that perfect library you found for parsing configuration files? Too bad, it's sync, and you're in async land. You'll need to either:

- wrap every call in something like tokio::task::spawn_blocking,
- hunt for (or write) an async port of the library, or
- call it directly, block the executor, and hope nobody notices.

This split means that library authors now have to choose between writing sync code and excluding the async crowd, writing async code and requiring the async runtime, or maintaining two versions and doubling their maintenance burden. Most choose option one or two, leading to a fragmented ecosystem.

Threads don't have this problem. A function that does work is a function that does work, regardless of whether it's called from a spawned thread or the main execution context. Okay, that is not entirely true: the Send and Sync traits still chip in, along with the concurrency primitives. Nevertheless...
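For completeness, the usual bridge from async land back to a sync library looks something like this; a sketch assuming Tokio, with std::fs::read_to_string standing in for the sync library call:

// Run a blocking, sync-library call from async code without
// stalling the executor threads (Tokio's escape hatch).
async fn load_config() -> String {
    tokio::task::spawn_blocking(|| {
        // Executes on Tokio's dedicated blocking-thread pool.
        std::fs::read_to_string("app.toml").unwrap_or_default()
    })
    .await
    .expect("blocking task panicked")
}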

The C and Nginx Reality Check

Wait. Okay, I hear you. Maybe it's just Rust that has problems with async. Let's check the rest of the world.

Here's where the async narrative completely falls apart: Nginx, the web server that powers a significant portion of the Internet, is written in C and uses an event-driven architecture that doesn't rely on async/await syntax. Nginx achieves its legendary performance through a combination of event loops, non-blocking I/O, and careful memory management.

The existence of successful async mode extensions for Nginx doesn't contradict this point—it reinforces it. The base system is so solid that you can layer additional async capabilities on top when needed (!!!) rather than building async-first and hoping everything else fits.

C programmers didn't need to invent async/await because they already had the tools they needed: select, epoll, kqueue, and other multiplexing mechanisms that let you handle multiple I/O operations efficiently. These tools require more manual management than async/await but also provide more control and predictability.
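To give a flavour of it in Rust terms, here is a rough, Linux-only sketch of an epoll loop using the libc crate directly; purely illustrative, with error handling mostly omitted:

use std::os::unix::io::RawFd;

fn event_loop(listener_fd: RawFd) -> ! {
    unsafe {
        let epfd = libc::epoll_create1(0);
        assert!(epfd >= 0, "epoll_create1 failed");

        // Register the listening socket for readability events.
        let mut ev = libc::epoll_event {
            events: libc::EPOLLIN as u32,
            u64: listener_fd as u64, // user data: we stash the fd itself
        };
        libc::epoll_ctl(epfd, libc::EPOLL_CTL_ADD, listener_fd, &mut ev);

        let mut ready = [libc::epoll_event { events: 0, u64: 0 }; 64];
        loop {
            // Block until at least one fd is ready: no futures, no wakers.
            let n = libc::epoll_wait(epfd, ready.as_mut_ptr(), 64, -1);
            for event in &ready[..n.max(0) as usize] {
                let fd = event.u64 as RawFd;
                // ... accept / read / write on `fd` in non-blocking mode ...
                let _ = fd;
            }
        }
    }
}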

The async abstraction promised to make this easier, but in practice, it often just kicks the (complexity) can down the road. Instead of reasoning about file descriptors and event loops, you're reasoning about futures, executors, and wake mechanisms. The complexity didn't disappear—it just got dressed up in fancier clothing (a bit like all those people using Python and/or Javascript and pretending to do high-performance services... just kidding, but not so much ;-P ).

The Thread Renaissance: Why Old School Still Works

While the async evangelists were busy building their tower of Babel, a quiet revolution was happening in the background (or on a background thread—nerd joke alert!).

Thread-per-core architectures like Bytedance's monoio and glommio began demonstrating that they could scale significantly better than work-stealing architectures (like the one implemented by the Tokio runtime). Suddenly, the fundamental assumption that async was inherently more scalable started looking less like scientific fact and more like marketing crap (or "benchmarketing", copyright of someone wise on the Web; sorry, I can't find that reference anymore).

The dirty secret that async proponents don't want to discuss is that boring, old-fashioned operating system threads are pretty good at their job. Modern operating systems have spent decades optimising thread creation, scheduling, and context switching. The kernel developers who built these systems aren't exactly amateurs, and they've been solving concurrency problems since before most async frameworks were even concepts.

I've been diving into how async works behind the scenes, thanks to some great docs and code (Tokio again), and I've found something important! It turns out that a lot of the responsibility for managing concurrency falls on our shoulders as developers: async runtimes use a cooperative scheduler, so it is up to whoever writes the async code to inform the scheduler of the state of the computation and to release all resources properly.

Also, there's this thing called the "10µs rule" that really matters for async code. Async runtimes can only switch tasks at .await points (remember? It is cooperative). So, if your code runs for more than about 10 microseconds without hitting an .await, you could end up "starving" the scheduler. This can definitely leave beginners (and not only beginners) scratching their heads, wondering why things aren't working quite right or as fast as expected!
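Here is a minimal sketch of the starvation problem and one common mitigation, assuming Tokio (the loop bounds are arbitrary):

// This task never reaches an `.await`, so the cooperative
// scheduler cannot switch away from it: other tasks starve.
async fn hog() -> u64 {
    let mut acc = 0u64;
    for i in 0..1_000_000_000u64 {
        acc = acc.wrapping_add(i); // long CPU burst, zero yield points
    }
    acc
}

// One common mitigation: hand control back periodically.
async fn polite_hog() -> u64 {
    let mut acc = 0u64;
    for i in 0..1_000_000_000u64 {
        acc = acc.wrapping_add(i);
        if i % 1_000_000 == 0 {
            tokio::task::yield_now().await; // give the scheduler a chance
        }
    }
    acc
}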

Work-stealing schedulers, like the one (beautifully) implemented by Tokio, can come to the rescue, but at a cost: in a world optimised for very small tasks, the price of moving the execution of a task, and especially the related data, to another CPU core can be higher than simply waiting for the computation to complete.

Not convinced yet?

Rust's standard library provides excellent thread primitives that work predictably, compile quickly, and don't require choosing between competing religious factions. Add a few dependencies, and we have everything we need for almost any web application or computation, without any async.

For instance, for your CPU-bound tasks, std::thread::spawn doesn't care about your runtime politics: it just works. Thread pools like those provided by the rayon crate offer data parallelism without the existential crisis of wondering whether you've chosen the right async executor for your particular shade of I/O-bound work. And if you, like me, are a minimal-dependencies fanatic, creating your own simple thread pool is pretty doable and enjoyable (see the sketch below), thanks to the aforementioned well-designed thread and sync primitives of Rust.
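As a taste, here is a minimal hand-rolled pool using only std; a sketch, not production code:

use std::sync::{mpsc, Arc, Mutex};
use std::thread;

type Job = Box<dyn FnOnce() + Send + 'static>;

struct ThreadPool {
    tx: mpsc::Sender<Job>,
}

impl ThreadPool {
    fn new(size: usize) -> Self {
        let (tx, rx) = mpsc::channel::<Job>();
        // The receiver is single-consumer, so the workers share it
        // behind an Arc<Mutex<..>>.
        let rx = Arc::new(Mutex::new(rx));
        for _ in 0..size {
            let rx = Arc::clone(&rx);
            thread::spawn(move || loop {
                // Hold the lock only long enough to pop one job.
                let job = match rx.lock().unwrap().recv() {
                    Ok(job) => job,
                    Err(_) => break, // all senders gone: shut down
                };
                job();
            });
        }
        ThreadPool { tx }
    }

    fn execute<F: FnOnce() + Send + 'static>(&self, f: F) {
        self.tx.send(Box::new(f)).unwrap();
    }
}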

The Beauty of Rust's Threading Primitives

Let's appreciate what we have:

std::thread::spawn

It is simple, straightforward, and does exactly what it says on the tin. Want to run something concurrently? Spawn a thread. Need to communicate between threads? Use channels. Need shared state? Use Arc<Mutex<T>> or Arc<RwLock<T>>.

use std::thread;
use std::sync::mpsc;

fn handle_requests() {
    let (tx, rx) = mpsc::channel();

    // Spawn worker threads
    for i in 0..4 {
        let tx = tx.clone();
        thread::spawn(move || {
            // Do the computation work (a toy stand-in here)...
            let res = i * i;
            tx.send(format!("Result from thread {} = {}", i, res)).unwrap();
        });
    }
    // Collect results
    for _ in 0..4 {
        println!("{}", rx.recv().unwrap());
    }
}

There are no external dependencies, no runtime, and no colour-coded functions—just threads doing thread things.

std::sync::mpsc

Rust's message-passing channels are fantastic for inter-thread communication. They're zero-cost abstractions over efficient lock-free data structures and compose beautifully with the rest of the standard library.

let (tx, rx) = std::sync::mpsc::channel();

std::thread::spawn(move || {
    // `start_computation()` stands in for any expensive work.
    tx.send(start_computation()).unwrap(); // Bye, async!
});

let result = rx.recv().unwrap(); // Simple, debuggable, no .await

PS: channels are fantastic but also quite bare; for instance, the receiver isn't Sync and can't be cloned, so you can't share it among threads for multiple consumers (you can move it to one other thread, though), while the sender can be cloned cheaply and handed out to many threads (after all, mpsc means Multiple Producers, Single Consumer). Tokio has a more powerful version that supports multiple consumers (broadcast); switching between the two implementations will require redesigning your solution. I learned this the hard way, but it's doable and, in the end, enjoyable to watch your dependencies shrink (hopefully while respecting the timeline).

std::sync::Arc and Friends

Atomic reference counting with Arc, combined with Mutex or RwLock (or, at an even lower level, the memory-ordering and atomic APIs of std::sync::atomic directly), provides safe shared-state concurrency without the complexity of async runtimes.
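Since the atomics get their own snippet below, here is a minimal Arc<Mutex<T>> sketch first:

use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let counter = Arc::new(Mutex::new(0));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // Lock, mutate, unlock (on drop of the guard).
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(*counter.lock().unwrap(), 4);
}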

use std::sync::atomic::{AtomicI32, Ordering::Relaxed};
use std::thread;

static X: AtomicI32 = AtomicI32::new(0);

fn main() {
    X.store(1, Relaxed);
    let t = thread::spawn(f);
    X.store(2, Relaxed);
    t.join().unwrap();
    X.store(3, Relaxed);
}

fn f() {
    let x = X.load(Relaxed);
    assert!(x == 1 || x == 2);
    // the assertion in this example cannot fail
}

(from the fantastic book "Rust Atomics and Locks")

Scoped Threads

AKA, borrowing without fear.

let mut data = vec![1, 2, 3];  

std::thread::scope(|s| {  
    s.spawn(|| {  
        data.push(4); // Borrow checker approves!!!
    });  
}); // Auto-joins threads here  

Zero-cost shared memory thanks to Rust's lifetimes.

The Performance Reality Check

Let's talk numbers.

The performance difference between async and threaded approaches is negligible in most real-world applications. The bottlenecks are usually:

- database queries and calls to downstream services,
- network and disk latency,
- lock contention and (de)serialisation,

not the concurrency model itself.

The scenarios where async provides significant performance benefits are specific and measurable:

- tens of thousands of concurrent, mostly idle connections,
- workloads dominated by waiting on slow peers (long polling, WebSockets),
- memory-constrained deployments where a stack per connection is too expensive.

For most applications—web APIs with reasonable traffic, CLI tools, desktop applications, batch processors—threads are simpler, more maintainable, and perform just as well.

| Metric | OS Thread (std) | Async Task (Tokio) |
| --- | --- | --- |
| Memory (per task) | ~1-10 MB | ~1-10 KB |
| Spawn/Switch Cost | µs range | ns range |
| Blocking Cost | "Meh" | "OMG PANIC" |
| Debugging | gdb/perf | "Pray to the Tokio gods" |

Yes, threads cost more memory. But:

- stack sizes are configurable (see the sketch below), and pages are only committed as they are actually touched;
- most applications juggle tens of threads, not tens of thousands of tasks;
- RAM is cheaper than the developer hours spent debugging an executor.
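And for the record, shrinking (or naming) a thread's stack is a one-liner with std::thread::Builder:

use std::thread;

fn main() {
    let handle = thread::Builder::new()
        .name("worker".into())
        .stack_size(64 * 1024) // 64 KiB instead of the platform default
        .spawn(|| {
            // lightweight work here
        })
        .expect("failed to spawn thread");
    handle.join().unwrap();
}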

When Async Makes Sense

Okay, we've roasted async enough. Haven't we?

To be fair, async isn't universally wrong; it's just overkill for most applications, or simply used wrong. There are specific scenarios where async shines:

- massive numbers of concurrent, mostly idle connections (chat servers, push notifications, WebSockets),
- high-fan-out proxies and API gateways,
- platforms where spawning thousands of OS threads simply isn't an option.

Conclusion

The async revolution promised simplicity and performance but delivered complexity and fragmentation. While async has its place in the Rust ecosystem, that place is, IMHO, much smaller than the current usage patterns (and the sheer number of online posts and talks) suggest.

Choose threads when:

- the workload is CPU-bound, or the number of concurrent connections is modest;
- you value few dependencies, fast compiles, and straightforward debugging;
- the standard library primitives (threads, channels, Arc and friends) cover your needs.

Choose async when:

- you genuinely need tens of thousands of concurrent, mostly idle connections;
- the libraries you depend on are already async (welcome aboard the Tokio Bus);
- you have measured, and the numbers say threads don't cut it.

Please don't get me wrong: async Rust isn't bad—we're just using it wrong. We've treated it as a silver bullet when it's actually a specialised tool for specific problems.

I would also like to highlight that a framework like Tokio has some of the best open-source contributors and excellent documentation. This can greatly help everyone understand how asynchronous programming works under the hood and determine the best strategy for a particular problem. I strongly recommend reviewing both the documentation and the code.

My final take? For most applications, threads offer a simpler, more maintainable alternative that works with Rust's strengths rather than against them.