Do We Really Need Async?
Maybe this is just a grumpy old geek's take while I'm sipping my espresso (after all, I'm Italian, and some habits are hard to shake, or maybe I simply don't want to) and reminiscing about manual memory management.
It was a dark and stormy night.

Oh, wait... That is another story.
To be honest, it all started when I was doing a side project for a low-latency data pipeline. I was tired of adding all this async code, plus dozens of dependencies, so I had the "brilliant" idea to apply some concepts I had been cooking in my head after reading some interesting blog posts (for instance Async/Await Is Real And Can Hurt You and tasks are the wrong abstraction):
what if I remove all the async code and use instead just plain old threads? Shouldn't it be easier and lighter?
Long story short: No, not easier, but definitely lighter (arguably faster) and easier to debug/test. Now, you can skip all the rest if you are not interested in some rant about Rust.
Oh, good. You're still here. I'm not even asking why, but let's move on.
At the foundation of all this madness is one simple idea: the reason for (enjoying) using Rust is the concept of fearless concurrency. And yet, over the past few years, an explosion of async/await code has popped up everywhere – even in the tiniest programs that could have been solved with a simple thread or two. Is async the cure-all it's cracked up to be, or have we just fallen for the hype?
Async: The Siren Song of "Scalability"
Async/await arrived like a flashy superhero promising to save us from "blocking I/O hell." And sometimes it delivers:
- Legit use cases: Handling 10k+ network connections (e.g., chat servers, DB proxies) where OS threads choke on memory overhead (check the guide Async programming in Rust)
- Zero-cost abstractions: Futures compile to state machines without heap allocations (nice for embedded!) (same reference as before)
- I/O-bound workloads: Epoll/kqueue magic lets one thread juggle many sockets (but this can also be leveraged without async; see NGINX later)
But then we got drunk on the Kool-Aid...
Async Hype Train
Let's hop on the async train. So easy!
Add async here, throw in a .await there, and – voila! – your code is now asynchronous. Until, of course, you realise that once you mark one function async, every caller up the chain also needs to be async.
async is "invasive".
The path of least resistance quickly becomes "make your entire codebase async", rather than just the parts that need to be async. This so‑called "function colouring" means that suddenly no library function is safe – you need async versions of almost everything. Suddenly, the standard library isn't enough, and every tiny task requires a special async crate. Fun, right?
The Dependency Black Hole
Here's where things get funny.
Want to write async Rust? Better buckle up for a dependency ride that would make a JavaScript developer blush; and this is where I drew the line (yep, you got it: my lower bound is always JavaScript; I can't go lower!).
The async ecosystem has created what I like to call the "async dependency fractal" (sorry, my inner, failed mathematician is always poking around). Every async crate depends on other async crates, which rely on more async crates, ad infinitum.
Once you decide to use async, you inevitably choose a runtime (usually Tokio) and drag in a ton of crates. Corrode Rust points out that Tokio is a deep-seated requirement for many libraries: "one of your dependencies integrates Tokio, effectively nudging you towards its adoption." In fact, as of this writing, more than 27k crates on crates.io use Tokio.
In other words, by choosing async, you hitch a ride on the Tokio Bus, whether you meant to or not. Once you're on the bus, you either stick with Tokio or end up having to "rewrite it in another runtime" when you find a library leaking Tokio-specific types. Maybe this was the correct meaning of "Rewrite all in Rust". Not what I envisioned or wanted.
Consider also the absurdity of this executor-coupling approach: we've created a system where documentation and examples for one async runtime simply don't work with others! The result is what the community euphemistically calls "the One True Runtime"—Tokio—which has achieved dominance and become the de facto standard for async in Rust.
The Split Ecosystem Problem
If the async runtime war wasn't enough, the most insidious issue with async Rust is that it has effectively created two distinct languages. There's "Sync Rust" and "Async Rust," and they don't play nicely together.
Want to use that perfect library you found for parsing configuration files? Too bad, it's sync, and you're in async land. You'll need to either:
- Wrap it in tokio::task::spawn_blocking() (adding overhead and complexity)
- Find an async version (which may not exist or may be inferior)
- Write your own async wrapper (because that's what you wanted to spend your weekend doing)
This split means that library authors now have to choose between writing sync code and excluding the async crowd, writing async code and requiring the async runtime, or maintaining two versions and doubling their maintenance burden. Most choose option one or two, leading to a fragmented ecosystem.
Threads don't have this problem. A function that does work is a function that does work, regardless of whether it's called from a thread or the main execution context. Okay, that is not entirely true, with the Send and Sync traits chipping in, plus the concurrency primitives. Nevertheless...
The C and Nginx Reality Check
Wait. Okay, I hear you. Maybe it's just Rust that has problems with async. Let's check the rest of the world.
Here's where the async narrative completely falls apart: Nginx, the web server that powers a significant portion of the Internet, is written in C and uses an event-driven architecture that doesn't rely on async/await syntax. Nginx achieves its legendary performance through a combination of event loops, non-blocking I/O, and careful memory management.
The existence of successful async mode extensions for Nginx doesn't contradict this point—it reinforces it. The base system is so solid that you can layer additional async capabilities on top when needed (!!!) rather than building async-first and hoping everything else fits.
C programmers didn't need to invent async/await because they already had the tools they needed: select, epoll, kqueue, and other multiplexing mechanisms that let you handle multiple I/O operations efficiently. These tools require more manual management than async/await but also provide more control and predictability.
The async abstraction promised to make this easier, but in practice, it often just kicks the (complexity) can down the road. Instead of reasoning about file descriptors and event loops, you're reasoning about futures, executors, and wake mechanisms. The complexity didn't disappear—it just got dressed up in fancier clothing (a bit like all those people using Python and/or Javascript and pretending to do high-performance services... just kidding, but not so much ;-P ).
The Thread Renaissance: Why Old School Still Works
While the async evangelists were busy building their tower of Babel, a quiet revolution was happening in the background (or on a background thread—nerd joke alert!).
Thread-per-core architectures like Bytedance's monoio and glommio began demonstrating that they could scale significantly better than work-stealing architectures (like the one implemented by the Tokio runtime).
runtime). Suddenly, the fundamental assumption that async was inherently more scalable started looking less like scientific fact and more like marketing crap (or benchmarketing, copyright of someone wise on the Web, sorry... can't find that reference anymore).
The dirty secret that async proponents don't want to discuss is that boring, old-fashioned operating system threads are pretty good at their job. Modern operating systems have spent decades optimising thread creation, scheduling, and context switching. The kernel developers who built these systems aren't exactly amateurs, and they've been solving concurrency problems since before most async frameworks were even concepts.
I've been diving into how async works behind the scenes, thanks to some great docs and code (Tokio again), and I've found something important! It turns out that a lot of the responsibility for managing concurrency falls on our shoulders as developers: async runtimes use a cooperative scheduler, so it is up to whoever writes the async code to inform the scheduler of the state of the computation and release all resources properly.
Also, there's this thing called the "10µs rule" that's really important in async code. Async runtimes only schedule tasks at .await points (remember? It is cooperative). So, if your code runs for more than about 10 microseconds without hitting an .await, you could end up "starving" the scheduler. This can definitely leave beginners (and not only beginners) scratching their heads, wondering why things aren't working quite right or as fast as expected!
Work-stealing schedulers, like the one (beautifully) implemented by Tokio, could come to the rescue, but at a cost. In a world optimised for very small tasks, the cost of moving the execution of a task, and especially the related data, to another CPU core could be higher than simply waiting for the computation to complete.
Not convinced yet?
Rust's standard library provides excellent thread primitives that work predictably, compile quickly, and don't require choosing between competing religious factions. Add a few dependencies and we have everything we need for any web application or computation, without any async:
- std::thread (built-in, zero dependencies)
- std::sync (built-in, zero dependencies)
- ureq (simple HTTP client, no async runtime needed)
- rusqlite or a direct SQL driver (no async runtime needed)
- serde and serde_json (no async runtime needed)
For instance, for your CPU-bound tasks, std::thread::spawn doesn't care about your runtime politics—it just works. Thread pools like those provided by the rayon crate offer data parallelism without the existential crisis of wondering whether you've chosen the right async executor for your particular shade of I/O-bound work. And if you, like me, are a minimum-dependencies fanatic, creating your own simple thread pool is pretty doable and enjoyable, thanks to the aforementioned well-designed thread and sync primitives of Rust.
The Beauty of Rust's Threading Primitives
Let's appreciate what we have:
std::thread::spawn
It is simple, straightforward, and does exactly what it says on the tin. Want to run something concurrently? Spawn a thread. Need to communicate between threads? Use channels. Need shared state? Use Arc<Mutex<T>> or Arc<RwLock<T>>.
use std::thread;
use std::sync::mpsc;

fn handle_requests() {
    let (tx, rx) = mpsc::channel();
    // Spawn worker threads
    for i in 0..4 {
        let tx = tx.clone();
        thread::spawn(move || {
            // Do computation work (placeholder for the real job)...
            let res = i * i;
            tx.send(format!("Result from thread {} = {}", i, res)).unwrap();
        });
    }
    // Collect results
    for _ in 0..4 {
        println!("{}", rx.recv().unwrap());
    }
}
There are no external dependencies, no runtime, and no colour-coded functions—just threads doing thread things.
std::sync::mpsc
Rust's message-passing channels are fantastic for inter-thread communication. They're zero-cost abstractions over efficient lock-free data structures and compose beautifully with the rest of the standard library.
let (tx, rx) = std::sync::mpsc::channel();
std::thread::spawn(move || {
    // start_computation() stands in for any expensive sync function
    tx.send(start_computation()).unwrap(); // Bye, async!
});
let result = rx.recv().unwrap(); // Simple, debuggable, no .await
PS: channels are fantastic but also quite bare; for instance, the receiver can't be cloned (and isn't Sync), so there can only ever be a single consumer, while the sender can be cloned easily and cheaply and shared with other threads (after all, mpsc means Multiple Producers, Single Consumer). Tokio has a more powerful version that supports multiple consumers (broadcast); switching between the two implementations will require redesigning your solution. I learned it the hard way, but it's doable and, in the end, enjoyable watching your dependencies shrink (hopefully while respecting the timeline).
std::sync::Arc and Friends
Atomic reference counting with Arc, combined with Mutex or RwLock, or, even more low-level, using the Memory Ordering and Atomic APIs (std::sync::atomic) directly, provides safe shared-state concurrency without the complexity of async runtimes.
use std::sync::atomic::AtomicI32;
use std::sync::atomic::Ordering::Relaxed;
use std::thread;

static X: AtomicI32 = AtomicI32::new(0);

fn main() {
    X.store(1, Relaxed);
    let t = thread::spawn(f);
    X.store(2, Relaxed);
    t.join().unwrap();
    X.store(3, Relaxed);
}

fn f() {
    let x = X.load(Relaxed);
    assert!(x == 1 || x == 2);
    // the assertion in this example cannot fail:
    // f runs between spawn and join, so it can never observe the 3
}
(from the fantastic book "Rust Atomics and Locks")
Scoped Threads
AKA, borrowing without fear.
let mut data = vec![1, 2, 3];
std::thread::scope(|s| {
s.spawn(|| {
data.push(4); // Borrow checker approves!!!
});
}); // Auto-joins threads here
Zero-cost shared memory thanks to Rust's lifetimes.
The Performance Reality Check
Let's talk numbers.
The performance difference between async and threaded approaches is negligible in most real-world applications. The bottlenecks are usually:
- Network I/O: Whether you're waiting on a network call in an async task or a thread, you're still waiting on the network.
- Database queries: Same story—the database is the bottleneck, not your concurrency model.
- Business logic: CPU-bound work doesn't benefit from async at all.
The scenarios where async provides significant performance benefits are specific and measurable:
- Handling thousands of concurrent connections
- Very memory-constrained environments
- Applications with extremely frequent context-switching
For most applications—web APIs with reasonable traffic, CLI tools, desktop applications, batch processors—threads are simpler, more maintainable, and perform just as well.
| Metric | OS Thread (std) | Async Task (Tokio) |
|---|---|---|
| Memory (per task) | ~1-10 MB | ~1-10 KB |
| Spawn/Switch Cost | µs range | ns range |
| Blocking Cost | "Meh" | "OMG PANIC" |
| Debugging | gdb/perf | "Pray to Tokio gods" |
Yes, threads cost more memory. But:
- For hundreds of parallel tasks (not 100k), threads are FINE.
- rayon parallel iterators make CPU-bound work trivial
When Async Makes Sense
Okay, we've roasted async enough. Haven't we?
To be fair, async isn't universally wrong—it's just overkill for most applications, or mostly used wrong. There are specific scenarios where async shines:
- when you need to maintain thousands of idle connections (like chat servers or real-time notification systems),
- when you're building infrastructure that needs to handle massive concurrency,
- when you're working in resource-constrained environments where memory usage is critical.
Conclusion
The async revolution promised simplicity and performance but delivered complexity and fragmentation. While async has its place in the Rust ecosystem, that place is, IMHO, much smaller than current usage patterns/numbers of online posts/talks suggest.
Choose threads when:
- Your concurrency needs are "modest", so to speak (dozens to hundreds of concurrent operations); you can always use thread pools to scale even more
- You value simplicity and maintainability over theoretical maximum throughput.
- Your team is more familiar with traditional threading models.
- You want to minimise external dependencies.
- You're building CLI tools, desktop applications, or "normal" web services (not at Google level of use)
- Your workload includes significant CPU-bound computation (this point is pretty important, though)
- You value predictability and observability.
Choose async when:
- You need to handle thousands+ of concurrent I/O operations
- Memory usage is a critical constraint
- You're building on existing async infrastructure
- Your application is purely I/O-bound with minimal CPU work
- You have custom scheduling requirements, leveraging the cooperative approach of an async scheduler (vs the pre-emptive OS thread scheduler)
- You don't want to end up like me, rewriting most of your services at 3 AM just because you decided that you want to get rid of all those async/await calls.
Please don't get me wrong: async Rust isn't bad—we're just using it wrong. We've treated it as a silver bullet when it's actually a specialised tool for specific problems.
I would also like to highlight that a framework like Tokio has some of the best open-source contributors and excellent documentation. This can greatly help everyone understand how asynchronous programming works under the hood and determine the best strategy for a particular problem. I strongly recommend reviewing both the documentation and the code.
My final take? For most applications, threads offer a simpler, more maintainable alternative that works with Rust's strengths rather than against them.