Callbacks, synchronous and asynchronous
by havoc
Here are two guidelines for designing APIs that use callbacks, to add to my inadvertent collection of posts about minor API design points. I’ve run into the “sync vs. async” callback issue many times in different places; it’s a real issue that burns both API designers and API users.
Most recently, this came up for me while working on Hammersmith, a callback-based Scala API for MongoDB. I think it’s a somewhat new consideration for a lot of people writing JVM code, because traditionally the JVM uses blocking APIs and threads. For me, it’s a familiar consideration from writing client-side code based on an event loop.
Definitions
- A synchronous callback is invoked before a function returns, that is, while the API receiving the callback remains on the stack. An example might be `list.foreach(callback)`; when `foreach()` returns, you would expect that the callback had been invoked on each element.
- An asynchronous or deferred callback is invoked after a function returns, or at least on another thread's stack. Mechanisms for deferral include threads and main loops (other names include event loops, dispatchers, executors). Asynchronous callbacks are popular with IO-related APIs, such as `socket.connect(callback)`; you would expect that when `connect()` returns, the callback may not have been invoked yet, since it's waiting for the connection to complete.
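To make the distinction concrete, here's a minimal Java sketch of both shapes; `foreachSync` and `connectAsync` are invented names for illustration, not any particular library's API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

public class CallbackKinds {
    // Synchronous: invoked before the call returns, on the caller's stack.
    static <A> void foreachSync(List<A> xs, Consumer<A> callback) {
        for (A x : xs) callback.accept(x);
    }

    // Asynchronous: handed off to an executor; it may run after
    // connectAsync has already returned, on another thread's stack.
    static void connectAsync(String host, Consumer<String> callback) {
        CompletableFuture.runAsync(() -> callback.accept("connected to " + host));
    }

    public static void main(String[] args) {
        List<Integer> seen = new ArrayList<>();
        foreachSync(List.of(1, 2, 3), seen::add);
        // By the time foreachSync returns, every element has been visited.
        System.out.println(seen); // [1, 2, 3]
    }
}
```

When `connectAsync` returns, the callback may still be pending; there is no such window for `foreachSync`.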
Guidelines
Two rules that I use, based on past experience:
- A given callback should be either always sync or always async, as a documented part of the API contract.
- An async callback should be invoked by a main loop or central dispatch mechanism directly, i.e. there should not be unnecessary frames on the callback-invoking thread’s stack, especially if those frames might hold locks.
How are sync and async callbacks different?
Sync and async callbacks raise different issues for both the app developer and the library implementation.
Synchronous callbacks:
- Are invoked in the original thread, so do not create thread-safety concerns by themselves.
- In languages like C/C++, may access data stored on the stack such as local variables.
- In any language, may access data tied to the current thread, such as thread-local variables. For example, many Java web frameworks create thread-local variables for the current transaction or request.
- May be able to assume that certain application state is unchanged, for example that objects still exist, timers have not fired, IO has not occurred, or whatever invariants the structure of the program guarantees.
Asynchronous callbacks:
- May be invoked on another thread (for thread-based deferral mechanisms), so apps must synchronize any resources the callback accesses.
- Cannot touch anything tied to the original stack or thread, such as local variables or thread-local data.
- If the original thread held locks, the callback will be invoked outside them.
- Must assume that other threads or events could have modified the application’s state.
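The thread-local point is easy to demonstrate. Below is a small Java sketch (the `withRequest*` helpers are invented): a synchronous callback sees the caller's thread-local state, while the same callback deferred to a pool thread sees none of it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

public class ThreadLocalDemo {
    static final ThreadLocal<String> currentRequest = new ThreadLocal<>();

    // Synchronous: the callback sees the caller's thread-local state.
    static String withRequestSync(String name, Supplier<String> callback) {
        currentRequest.set(name);
        try {
            return callback.get();
        } finally {
            currentRequest.remove();
        }
    }

    // Asynchronous: the same callback, run on a pool thread, sees none of it.
    static String withRequestAsync(String name, Supplier<String> callback) {
        currentRequest.set(name);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = () -> callback.get();
            return pool.submit(task).get(); // .get() only to observe the result here
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
            currentRequest.remove();
        }
    }
}
```

The sync variant returns the value the caller set; the async variant returns `null`, because the pool thread never saw the caller's thread-local.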
Neither type of callback is "better"; both have uses. Consider `list.foreach(callback)`: in most cases, you'd be pretty surprised if that callback were deferred and did nothing on the current thread! But `socket.connect(callback)` would be totally pointless if it never deferred the callback; why have a callback at all?
These two cases show why a given callback should be defined as either sync or async; they are not interchangeable, and don’t have the same purpose.
Choose sync or async, but not both
Not uncommonly, it may be possible to invoke a callback immediately in some situations (say, data is already available) while the callback needs to be deferred in others (the socket isn’t ready yet). The tempting thing is to invoke the callback synchronously when possible, and otherwise defer it. Not a good idea.
Because sync and async callbacks have different rules, they create different bugs. It’s very typical that the test suite only triggers the callback asynchronously, but then some less-common case in production runs it synchronously and breaks. (Or vice versa.)
Requiring application developers to plan for and test both sync and async cases is just too hard, and it’s simple to solve in the library: If the callback must be deferred in any situation, always defer it.
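Here's a Java sketch of the anti-pattern and the fix (the cache and `fetch` are invented stand-ins): the first method's sync-vs-async behavior depends on cache state, so callers must survive both contracts; the second always defers, so callers only ever see one behavior.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

public class ResultCache {
    private final Map<String, String> cached = new ConcurrentHashMap<>();
    private final Executor dispatcher;

    public ResultCache(Executor dispatcher) { this.dispatcher = dispatcher; }
    public void put(String key, String value) { cached.put(key, value); }

    // Tempting but wrong: synchronous when the data is already available,
    // asynchronous otherwise -- two contracts, two sets of bugs.
    public void getSometimesSync(String key, Consumer<Optional<String>> callback) {
        String hit = cached.get(key);
        if (hit != null) {
            callback.accept(Optional.of(hit));                     // sync path
        } else {
            dispatcher.execute(() -> callback.accept(fetch(key))); // async path
        }
    }

    // Better: if the callback must ever be deferred, always defer it.
    public void getAlwaysAsync(String key, Consumer<Optional<String>> callback) {
        dispatcher.execute(() ->
            callback.accept(Optional.ofNullable(cached.get(key)).or(() -> fetch(key))));
    }

    private Optional<String> fetch(String key) {
        return Optional.empty(); // stand-in for real IO
    }
}
```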
Example case: GIO
There’s a great concrete example of this issue in the documentation for GSimpleAsyncResult in the GIO library; scroll down to the Description section and look at the example about baking a cake asynchronously. (GSimpleAsyncResult is equivalent to what some frameworks call a future or promise.) The library provides two methods: `complete_in_idle()`, which defers callback invocation to an “idle handler” (just an immediately-dispatched one-shot main loop event), and plain `complete()`, which invokes the callback synchronously. The documentation suggests using `complete_in_idle()` unless you know you’re already in a deferred callback with no locks held (i.e. if you’re just chaining from one deferred callback to another, there’s no need to defer again).
GSimpleAsyncResult is used in turn to implement IO APIs such as `g_file_read_async()`, and developers can assume the callbacks used in those APIs are deferred.
GIO works this way and documents it at length because the developers building it had been burned before.
Synchronized resources should defer all callbacks they invoke
Really, the rule is that a library should drop all its locks before invoking an application callback. But the simplest way to drop all locks is to make the callback async, thereby deferring it until the stack unwinds back to the main loop, or running it on another thread’s stack.
This is important because applications can’t be expected to avoid touching your API inside the callback. If you hold locks and the app touches your API while you do, the app will deadlock. (Or if you use recursive locks, you’ll have a scary correctness problem instead.)
Rather than deferring the callback to a main loop or thread, the synchronized resource could try to drop all its locks; but that can be very painful because the lock might be well up in the stack, and you end up having to make each method on the stack return the callback, passing the callback all the way back up the stack to the outermost lock holder who then drops the lock and invokes the callback. Ugh.
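A sketch of the two shapes in Java (the class and its methods are invented): the first invokes the callback while holding the library lock; the second does the locked work, drops the lock, and then hands the callback to a dispatcher so it runs with no library locks on the stack.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SynchronizedResource {
    private final Object lock = new Object();
    private final ExecutorService dispatcher = Executors.newSingleThreadExecutor();

    // Dangerous: the callback runs with `lock` held. If the callback
    // touches this API again in a way that needs the lock on another
    // thread (say, via a future it waits on), the app deadlocks.
    public void queryHoldingLock(Runnable callback) {
        synchronized (lock) {
            // ... do the locked work ...
            callback.run();
        }
    }

    // Safe: do the locked work, drop the lock, then defer the callback.
    public void queryDeferred(Runnable callback) {
        synchronized (lock) {
            // ... do the locked work ...
        }
        dispatcher.execute(callback);
    }

    public void shutdown() { dispatcher.shutdown(); }
}
```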
Example case: Hammersmith without Akka
In Hammersmith as originally written, the following pseudocode would deadlock:
connection.query({ cursor => /* iterate cursor here, touching connection again */ })
Iterating the cursor will go back through the MongoDB connection. The query callback was invoked from code in the connection object… which held the connection lock. Not going to work, but this is natural and convenient code for an application developer to write. If the library doesn’t defer the callback, the app developer has to defer it themselves. Most app developers will get this wrong at first, and once they catch on and fix it, their code will be cluttered by some deferral mechanism.
Hammersmith inherited this problem from Netty, which it uses for its connections; Netty does not try to defer callbacks (I can understand the decision since there isn’t an obvious default/standard/normal/efficient way to defer callbacks in Java).
My first fix for this was to add a thread pool just to run app callbacks. Unfortunately, the recommended thread pool classes that come with Netty don’t solve the deadlock problem, so I had to fix that. (Any thread pool that solves deadlock problems has to have an unbounded size and no resource limits…)
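For reference, the stock JDK pool with those properties is `Executors.newCachedThreadPool()`, which grows without bound. A minimal sketch of a callback-deferral wrapper around it (the class is invented):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CallbackPool {
    // A fixed-size pool can deadlock: if every pool thread is blocked in a
    // callback that waits on another deferred callback, no thread is left
    // to run it. A cached pool grows on demand instead -- unbounded size,
    // no resource limits, exactly the trade-off described above.
    private static final ExecutorService pool = Executors.newCachedThreadPool();

    public static void defer(Runnable callback) {
        pool.execute(callback);
    }

    public static void shutdown() { pool.shutdown(); }
}
```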
In the end it works, but imagine what happens if callback-based APIs become popular and every jar you use with a callback in its API has to have its own thread pool. Kind of sucks. That’s probably why Netty punts on the issue. Too hard to make policy decisions about this in a low-level networking library.
Example case: Akka actors
Partly to find a better solution, next I ported Hammersmith to the Akka framework. Akka implements the Actor model. Actors are based on messages rather than callbacks, and in general messages must be deferred. In fact, Akka goes out of its way to force you to use an ActorRef to communicate with an actor, where all messages to the actor ref go through a dispatcher (event loop). Say you have two actors communicating; they will “call back” to each other using the `!` (“send message”) method:

```scala
actorOne ! Request("Hello")
// then in actorOne:
sender ! Reply("World")
```
These messages are dispatched through the event loop. I was expecting my deadlock problems to be over in this model, but I found a little gotcha – the same issue all over again, invoking application callbacks with a lock held. This time it was the lock on an actor while the actor is processing a message.
Akka actors can receive messages from either another actor or from a `Future`, and Akka wraps the sender in an object called `Channel`. The `!` method is in the interface to `Channel`. Sending to an actor with `!` will always defer the message to the dispatcher, but sending to a future will not; as a result, the `!` method on `Channel` does not define sync vs. async in its API contract.
This becomes an issue because part of the “point” of the actor model is that an actor runs in only one thread at a time; actors are locked while they’re handling a message and can’t be re-entered to handle a second message. Thus, making a synchronous call out from an actor is dangerous; there’s a lock held on the actor, and if the synchronous call tries to use the actor again inside the callback, it will deadlock.
I wrapped MongoDB connections in an actor, and immediately had exactly the same deadlock I’d had with Netty, where a callback from a query would try to touch the connection again to iterate a cursor. The query callback came from invoking the `!` method on a future. The `!` method on `Channel` breaks my first guideline (it doesn’t define sync vs. async in the API contract), but I was expecting it to be always async; as a result, I accidentally broke my second guideline and invoked a callback with a lock held.
If it were me, I would probably put deferral in the API contract for `Channel.!` to fix this; however, as Akka is currently written, if you’re implementing an actor that sends replies, and the application’s handler for your reply may want to call back and use the actor again, you must manually defer sending the reply. I stumbled on this approach, though there may be better ones:

```scala
private def asyncSend(channel: AkkaChannel[Any], message: Any) =
  Future(channel ! message, self.timeout)(self.dispatcher)
```
An unfortunate aspect of this solution is that it double-defers replies to actors, in order to defer replies to futures once.
The good news about Akka is that at least it has this solution – there’s a dispatcher to use! While with plain Netty, I had to use a dedicated thread pool.
Akka gives an answer to “how do I defer callbacks,” but it does require special-casing futures in this way to be sure they’re really deferred.
(UPDATE: Akka team is already working on this, here’s the ticket.)
Conclusion
While I found one little gotcha in Akka, the situation is much worse on the JVM without Akka because there isn’t a dispatcher to use.
Callback-based APIs really work best if you have an event loop, because it’s so important to be able to defer callback invocation.
That’s why callbacks work pretty well in client-side JavaScript and in node.js, and in UI toolkits such as GTK+. But if you start coding a callback-based API on the JVM, there’s no default answer for this critical building block. You’ll have to go pick some sort of event loop library (Akka works great), or reinvent the equivalent, or use a bloated thread-pools-everywhere approach.
Since callback-based APIs are so trendy these days… if you’re going to write one, I’d think about this topic up front.
You write that in asynchronous callbacks “apps must synchronize any resources the callback accesses” and then you go on explaining how sync vs. async is implemented in GIO.
That got me thinking. Aren’t deferred callbacks passed to g_file_read_async() (and most, if not all async GIO functions) always executed in the main loop?
At least I recall that g_simple_async_result_complete_in_idle() is almost always used at the end of threaded operations, which means that callbacks are executed in an idle handler (so in the main loop context) and that users do not need to worry about protecting data etc.
I bet there are ways to implement async functions with GAsyncResult where the callbacks are executed in the thread where the async operation was performed as well… but so far I never had to protect data in callbacks passed to any of the async GIO functions.
So either that part about GIO is misleading or I misunderstood what you were trying to say. 😉
Right, GIO always defers to the main loop thread and not a new thread, I think. The post is a little muddled because I was trying to be generic about deferral mechanisms. With GLib-based stuff generally there’s a goal to hide threads and make it so apps only see the one thread. But in the Java/Akka stuff the “main loop” (dispatcher, event loop, whatever you want to call it) typically runs event handlers in a thread pool.
With the GLib single-main-loop-thread approach, you do still have a certain class of reentrancy problem (state changing in between callbacks) but you don’t need (or get to use) thread synchronization tools.
Yep, the bit about state changes between the call and the callback is correct even for GIO of course.
An example of a bad API that might complete synchronously (on the current thread) or asynchronously (on an arbitrary thread) is Microsoft .NET’s IAsyncResult delegates. This API leads to confusion about critical sections and conflicting advice about resource cleanup (such as whether the `End()` cleanup function should still be called, or whether it can be called safely from the synchronous callback). UGH!
http://msdn.microsoft.com/en-us/library/ms228963.aspx
Great article. I’ve worked with lots of callback-based APIs over the years and yet never consciously made the sync vs. async distinction. I think of the sync callback more like just a way of providing first-class functions and closures in languages that lack them (“foreach” is just another name for “map”, right?). The rules you propose for callback libraries make a lot of sense.
Food for thought: I’ve been playing with the Go programming language lately, and it includes some interesting concurrency primitives. Instead of Actors, Go has goroutines (cf. lightweight threads with segmented stacks). Goroutines typically communicate via channels, which are a first-class datatype in Go. (I don’t know enough about Scala Actors to say whether Go’s model is substantially different, so I’ll continue…)
This has an interesting effect on Go API design: (1) most APIs are synchronous, since you can trivially make call “foo(bar)” asynchronous by writing “go foo(bar)” instead, and (2) deferred results are typically returned via channels, not callbacks.
For example, if “foo(bar)” returns an integer result that I want to get later, I would write:
```go
c := make(chan int)           // make a channel of ints
go func() { c <- foo(bar) }() // start a goroutine that will send foo(bar)'s result on channel c
// ... do some other stuff ...
result := <-c                 // get foo(bar)'s result
```
The more common way of doing this is by having a goroutine that just processes “foo” requests, and clients send it (bar, c1) or (baz, c2) over a channel. The “foo” goroutine processes these requests serially (so no locks are needed on any state that “foo” accesses), and results are sent back to the client via the channels they sent along with the request. This is much like a distributed system in which a server replies to clients via the sockets connected to those clients.
Other languages can do similar sync-> async transformations via threads and callbacks (or futures), but goroutines and channels make this lightweight, and channels provide nice synchronization guarantees while avoiding lots of the issues you describe above with callback-based APIs (though as you alluded to, there are always ways to get deadlocks when you are holding a lock and then wait on some external processing to complete).
This discussion increases my motivation to continue working through my Erlang book!
Wouldn’t you say that Netty 4.0 fixes the problems you describe?
http://netty.io/wiki/new-and-noteworthy.html
It’s a pity I found such an interesting post only now. However, I want to share my experience in async programming. It’s true that callbacks are either synchronous or asynchronous, and asynchronous callbacks should be deferred, but I’d like to specify: deferred relative to what? Each callback has its context of execution, and callback invocations should be serialized relative to that context. That context is not necessarily the current context, and in such a case, callback invocation can start immediately, before the initiating procedure returns. In the actor model, each actor forms its own context, so messages in different actors can be processed in parallel. Frankly, I cannot understand your deadlock problems: since connection and cursor callbacks belong to the same context, using a separate thread pool for callbacks cannot solve the problem.
This is dated, but as I am picking up Scala now, and soon I will embark on Akka, I wanted to comment. This post made me further appreciate how correctly Erlang / OTP gen_server behavior implements this concept. Use gen_server:call for sync, and gen_server:cast for async behavior. To handle them, you use handle_call and handle_cast functions. It can’t be simpler than that!
It should be noted that the article describes the behavior of Akka 1.x; since Akka 2.0 (released in March 2012) there has been only asynchronous message sending.
Good post. This is a tricky situation. I was trying to design an akka like actor system with an optimization where actors that are scheduled on the same thread just use sync callbacks instead of the more “expensive” queuing. This proved to be problematic for actors that msg themselves – self ! msg. This would cause a receive block to be invoked in the middle of another receive block. I decided to not do it for msgs to self and thought it would work for msgs to other actors scheduled on the same thread. Turns out even that is not a good idea and is error prone in case of cyclical patterns. For example actor A msgs actor B. In the middle of Actor A’s processing function we invoke Actor B’s processing function. This is fine and expected and cannot be differentiated from an async msg from Actor B’s point of view. But if Actor B now msgs Actor A again we enter Actor A’s processing function a second time violating the actor contract. I gave up on this optimization and instead use a local queue to queue up msgs to actors on the same thread. This is still better in terms of perf than queuing across cores but doesn’t have the nasty problems from sync callbacks.
Great article, but I have a question about C callbacks.
Let’s consider a scenario:

```c
int just_a_c_function() {
    callAfter2Sec(callback);
    /* ... */
    /* normal execution passes the 2-second mark here (main thread) */
    /* ... */
    return 0;
}
```

Then where will that callback be called? Please explain.
Will it be called after just_a_c_function returns?
Will it be called after 2 seconds on a separate thread?
How do I design a UI-related app architecture to handle such scenarios?
Thanks in advance.
Short answer: It could work in any of those ways; it depends on the framework you’re using or how you implement it yourself.