Demystifying async Lua in Neovim

On my routine Neovim propaganda duty, I was asked about async Lua. I could not find a reasonably simple explanation in the top search results. Here is my attempt.

Neovim and Node.js both rely on libuv to provide asynchronous I/O, so it is not surprising to find the same callback-based concurrency pattern at the core of many Neovim operations.

At the language level, however, JavaScript and Lua do not provide the same concurrency primitives. JavaScript has a high-level async/await syntax building on promises, while Lua provides asymmetric coroutines.

This article explains how async operations are handled in Lua using callbacks and coroutines, and how a higher-level async/await-like pattern can be built using promises.

§
Callbacks

A simple mechanism to schedule concurrent operations is a callback function. Instead of blocking, an asynchronous function accepts a callback that it will execute upon completion, passing the result of the computation.

The following function does a very slow async string concatenation which takes 1 second to execute, so it accepts a callback. This callback is internally passed to vim.defer_fn and invoked after 1 second with the concatenation result:

callbacks.lua

local function concat(left, right, callback)
    vim.defer_fn(function()
        callback(left .. right)
    end, 1000)
end

For example, we can print the result of "a" .. "b" from within the callback:

callbacks.lua

concat("a", "b", function(ab)
    vim.print(ab)
end)

pro tip

It's very easy to run Lua code within Neovim: paste it inside a buffer and run :%lua. You can also save to a file and run :luafile %.

:luafile %
ab

To concatenate multiple strings together, we run into the infamous callback-hell problem:

callbacks.lua

concat("a", "b", function(ab)
	concat(ab, "c", function(abc)
		concat(abc, "d", function(abcd)
			vim.print(abcd)
		end)
	end)
end)

Many languages like JavaScript solve this by introducing async/await keywords, but Lua supports a different concurrency pattern: asymmetric coroutines.

§
Coroutines

Coroutines in Lua are quite powerful and can be used to implement various patterns: iterators, generators, channels. The best way to learn about these patterns is to read Coroutine Basics and later sections from Programming in Lua by Roberto Ierusalimschy.

However, I think it's preferable not to dive into too much detail because they are neither an intuitive nor an effective concurrency primitive (the control flow is implicit and they are easy to misuse). Everything we will need can be summarized as the following coroutine operations:

coroutine.resume(coroutine.create(func)): create and start a coroutine that executes the given function. Note that coroutines are created in a suspended state, which is why we call coroutine.resume here to start its execution.
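coroutine.running() -> co: returns a handle to the running coroutine (or nil when called from the main thread, outside any coroutine, in the LuaJIT/Lua 5.1 runtime that Neovim embeds).
coroutine.yield(...) -> ...: suspends the running coroutine, handing its arguments back as extra results of the coroutine.resume call that started or resumed it, and returns the arguments passed in the next call to coroutine.resume.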
coroutine.resume(co, ...): resumes the execution of co, passing the arguments to the invocation of coroutine.yield that suspended this coroutine.
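To see these four operations exchange values, here is a minimal sketch in plain Lua (no Neovim API involved). The `trace` table and the variable names are only there for illustration.

```lua
-- Collect a trace so the order of events is easy to check.
local trace = {}

local co = coroutine.create(function(first)
	-- `first` is the extra argument of the first coroutine.resume call.
	table.insert(trace, "started with " .. first)
	-- Suspend; "paused" becomes a result of the resume call that started us.
	local second = coroutine.yield("paused")
	-- `second` is the extra argument of the next coroutine.resume call.
	table.insert(trace, "resumed with " .. second)
	return "finished"
end)

-- Coroutines are created suspended: nothing has run yet.
local status_before = coroutine.status(co)

local ok1, from_yield = coroutine.resume(co, "a")
local ok2, from_return = coroutine.resume(co, "b")

print(status_before)             -- suspended
print(from_yield, from_return)   -- paused  finished
print(table.concat(trace, "; ")) -- started with a; resumed with b
print(coroutine.status(co))      -- dead
```

Note that coroutine.resume always returns a boolean first, followed by whatever was yielded or returned.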

The following snippet demonstrates, in the simplest possible way, how coroutines interact with async callbacks:

coroutines.lua

local function concat_co(left, right)
	local co = coroutine.running()
	concat(left, right, function(result)
		coroutine.resume(co, result)
	end)
	return coroutine.yield()
end

coroutine.resume(coroutine.create(function()
	local ab = concat_co("a", "b")
	local abc = concat_co(ab, "c")
	local abcd = concat_co(abc, "d")
	vim.print(abcd)
end))

Still, you may be wondering how coroutines can possibly resume themselves from inside a callback created within the coroutine. It's actually not that complicated. After the yield, the call to coroutine.resume(coroutine.create(...)) returns and the current context of execution is finished. Neovim's main thread is free to poll the event loop for new things to do.

Later, it receives an event that indicates the vim.defer_fn delay has passed, and it calls into the Lua runtime to invoke the associated callback function. This callback calls coroutine.resume, which switches back to the coroutine execution context and resumes execution after the last call to coroutine.yield, continuing until another call to coroutine.yield, or the end of the coroutine.
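The sequence described above can be traced without Neovim by replacing the event loop with a plain queue. In this sketch, `pending`, `defer` and `run_event_loop` are hypothetical stand-ins for the event loop and vim.defer_fn (there is no real delay); only the coroutine calls are the real thing.

```lua
local log = {}
local pending = {} -- hypothetical stand-in for the event loop's timer queue

-- Stand-in for vim.defer_fn: remember the callback instead of waiting.
local function defer(fn)
	table.insert(pending, fn)
end

local function concat(left, right, callback)
	defer(function()
		callback(left .. right)
	end)
end

local function concat_co(left, right)
	local co = coroutine.running()
	concat(left, right, function(result)
		table.insert(log, "callback resumes coroutine")
		coroutine.resume(co, result)
	end)
	table.insert(log, "coroutine yields")
	return coroutine.yield()
end

coroutine.resume(coroutine.create(function()
	local ab = concat_co("a", "b")
	table.insert(log, "coroutine got " .. ab)
end))
-- The coroutine is suspended, so control is back in the "main thread".
table.insert(log, "first resume returned")

-- Stand-in for Neovim polling the event loop: run queued callbacks in order.
local function run_event_loop()
	while #pending > 0 do
		local fn = table.remove(pending, 1)
		fn()
	end
end
run_event_loop()

print(table.concat(log, "\n"))
```

The log shows the yield first, then the return of the initial resume, and only afterwards the callback waking the coroutine up.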

Unfortunately, there are some issues with this pattern:

Errors must be handled in the context they are raised in, which means that it is not enough to wrap the main function in pcall; you have to use pcall in between each yield. Additionally, errors do not survive across yields (they contain stack trace info and such that are only valid in the execution context they were created in).
You cannot call an async function from a sync context because starting the async operation and awaiting its result are fused into one step, which makes some patterns like joining difficult to implement.
Manual coroutine management must be embedded in every async function, which makes some patterns more difficult to implement, like awaiting multiple results without resorting to sub-coroutines.
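Part of the error-handling issue comes from how coroutine.resume reports failures: it never raises an error thrown inside the coroutine, it returns false followed by the error value. In the concat_co pattern that return value is discarded inside the callback, so a failure after a yield goes unnoticed unless the result of each resume is checked. A minimal sketch in plain Lua, where `resume_checked` is a hypothetical helper name:

```lua
local co = coroutine.create(function()
	coroutine.yield()         -- suspended, as if waiting for a callback
	error("boom after yield") -- raised once the coroutine is resumed
end)

-- First resume: runs until the yield, no error yet.
local ok_first = coroutine.resume(co)

-- Second resume (what a callback would do): the error is not raised here,
-- it comes back as (false, message).
local ok_second, err = coroutine.resume(co)

-- Hypothetical helper that a callback could use to surface failures
-- instead of silently dropping them.
local function resume_checked(thread, ...)
	local ok, e = coroutine.resume(thread, ...)
	if not ok then
		print("coroutine failed: " .. tostring(e))
	end
	return ok
end

local ok_checked = resume_checked(coroutine.create(function()
	error("something went wrong")
end))
```

Here `ok_second` and `ok_checked` are both false, and the calling code keeps running either way.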

Fortunately, JavaScript already provides a solution in the form of a higher-level concurrency primitive: promises.

§
Promises

The promise of promises is to encapsulate coroutine management in an API that looks like regular callbacks, with a clean separation between sync and async code. They also make error handling explicit with a dedicated reject function.

Let's not dive immediately into the implementation and instead show what a hypothetical promise-based API would look like:

promises.lua

local function concat_async(left, right)
	return Promise.new(function(resolve, _reject)
		concat(left, right, function(result)
			resolve(result)
		end)
	end)
end

coroutine.resume(coroutine.create(function()
	local ab = concat_async("a", "b"):await()
	local abc = concat_async(ab, "c"):await()
	local abcd = concat_async(abc, "d"):await()
	vim.print(abcd)
end))

Technically, concat_async is synchronous; it only returns a promise that we can await to get the result. This means that the implementation must consider that the callback may have already been invoked at the time we call await (in which case we don't want to suspend the coroutine and we return the result immediately).

promises.lua

local Promise = {}
Promise.__index = Promise

function Promise.new(fn)
	local self = setmetatable({}, Promise)

	self._done = false
	self._result = nil
	self._error = nil
	self._notify = function() end

	local resolve = function(...)
		if self._done then
			return
		end
		self._result = ...
		self._done = true
		self._notify()
	end

	local reject = function(err)
		if self._done then
			return
		end
		self._error = err
		self._done = true
		self._notify()
	end

	fn(resolve, reject)

	return self
end

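function Promise:await()
	if not self._done then
		-- Not settled yet: suspend the running coroutine until resolve or
		-- reject calls _notify. In Neovim's LuaJIT, coroutine.running()
		-- returns nil on the main thread, where we cannot suspend.
		local co = assert(coroutine.running(), "cannot await a pending promise outside a coroutine")
		self._notify = function()
			coroutine.resume(co)
		end
		-- coroutine.yield always suspends the running coroutine,
		-- so it takes no coroutine argument.
		coroutine.yield()
	end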
	assert(self._done)

	if self._error ~= nil then
		error(self._error)
	end
	return self._result
end

The implementation of await is similar to the manual implementation of an async function using coroutines, even a bit simpler in the sense that coroutines are only used as a notification mechanism and not to transfer the result.
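Two edge cases are worth trying out: a promise that settles before await is called, and a rejected promise. The sketch below runs without Neovim; it repeats a condensed copy of the Promise above so that it is self-contained, and `ready_async` and `fail_async` are hypothetical examples that settle synchronously.

```lua
-- Condensed copy of the Promise above, so the snippet runs on its own.
local Promise = {}
Promise.__index = Promise

function Promise.new(fn)
	local self = setmetatable({ _done = false, _notify = function() end }, Promise)
	-- Build resolve/reject: both store a value once, then notify.
	local function settle(field)
		return function(value)
			if self._done then return end
			self[field] = value
			self._done = true
			self._notify()
		end
	end
	fn(settle("_result"), settle("_error"))
	return self
end

function Promise:await()
	if not self._done then
		local co = coroutine.running()
		self._notify = function() coroutine.resume(co) end
		coroutine.yield()
	end
	if self._error ~= nil then error(self._error) end
	return self._result
end

-- Hypothetical examples: both settle before await has a chance to run.
local function ready_async()
	return Promise.new(function(resolve) resolve("ready") end)
end
local function fail_async()
	return Promise.new(function(_, reject) reject("not available") end)
end

local value, ok, err
coroutine.resume(coroutine.create(function()
	-- Already resolved: await returns at once, without yielding.
	value = ready_async():await()
	-- Rejected: await raises, so pcall turns it back into a value.
	ok, err = pcall(function()
		return fail_async():await()
	end)
end))
print(value, ok, err)
```

Because error is called with a string, Lua prefixes the message with position information, so `err` contains "not available" rather than being exactly equal to it.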

Unlike Rust futures, which do nothing until they are awaited, these promises start immediately, which allows for interesting concurrency patterns:

promises.lua

coroutine.resume(coroutine.create(function()
	local ab_promise = concat_async("a", "b")
	local bc_promise = concat_async("b", "c")
	-- "ab" .. "bc" yields "abbc", hence the variable name.
	local abbc = concat_async(ab_promise:await(), bc_promise:await()):await()
	vim.print(abbc)
end))

Here the computation of "ab" and "bc" is concurrent, which reduces the overall execution time by 1 second (about 2 seconds instead of 3). As an exercise, you could implement an equivalent of JavaScript's Promise.all to await any number of promises at once.
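One possible answer to the exercise, as a sketch: because these promises start as soon as they are created, awaiting them one after another already waits for all of them concurrently, and the total time is that of the slowest one. `await_all` is a hypothetical name, and it only assumes that each item has an await method like the Promise above. The `settled` helper is a stand-in for already-resolved promises, used so the snippet runs on its own.

```lua
-- Await every promise in the list and return the results in order.
-- Must be called from a coroutine if any promise is still pending.
-- If one promise is rejected, its await raises and the error propagates.
local function await_all(promises)
	local results = {}
	for i, promise in ipairs(promises) do
		results[i] = promise:await()
	end
	return results
end

-- Stand-in for an already-resolved promise (hypothetical, for illustration).
local function settled(value)
	return { await = function() return value end }
end

local results = await_all({ settled("ab"), settled("bc") })
print(table.concat(results, " ")) -- ab bc
```

With the promises from the example above, calling await_all({ concat_async("a", "b"), concat_async("b", "c") }) inside a coroutine should take about 1 second rather than 2, since both operations are started before the first await.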

§
Next steps

Another important operation that is typically more difficult to implement is cancellation. It can occur at three different levels:

Cancelling an async function requires the underlying operation to support that. For instance, async LSP requests return an ID that can be used to cancel them. A possible implementation is to add a Promise.cancel method that invokes a user-defined callback. Calling await after that should return a "cancelled" error.
Cancelling await based on a timeout. A straightforward implementation could just call cancel with a "timeout" error, using vim.defer_fn for delayed execution.
Cancelling an entire coroutine is harder to implement because promises can be created inside a coroutine and awaited in another, so it's not obvious which operations should be cancelled or not. The entire async operation could be wrapped inside a cancellable promise, or one could also draw some inspiration from Go's Context.
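The first two levels can be sketched as follows. This is only one possible design, not an established API: the `on_cancel` argument and the `cancel` method are assumptions of mine, added to a condensed copy of the Promise from the previous section so that the snippet runs without Neovim.

```lua
local Promise = {}
Promise.__index = Promise

-- on_cancel is a hypothetical extra argument: a user-defined function that
-- aborts the underlying operation (for example, cancelling an LSP request).
function Promise.new(fn, on_cancel)
	local self = setmetatable({ _done = false, _notify = function() end }, Promise)
	local function settle(field)
		return function(value)
			if self._done then return end
			self[field] = value
			self._done = true
			self._notify()
		end
	end
	self._on_cancel = on_cancel
	self._reject = settle("_error")
	fn(settle("_result"), self._reject)
	return self
end

-- Abort the operation and reject the promise; no-op once settled.
function Promise:cancel(reason)
	if self._done then return end
	if self._on_cancel then self._on_cancel() end
	self._reject(reason or "cancelled")
end

function Promise:await()
	if not self._done then
		local co = coroutine.running()
		self._notify = function() coroutine.resume(co) end
		coroutine.yield()
	end
	if self._error ~= nil then error(self._error) end
	return self._result
end

-- A request that never completes on its own, cancelled with a reason.
local aborted = false
local request = Promise.new(function() end, function() aborted = true end)
request:cancel("timeout")

-- The promise is already settled, so await raises without yielding.
local ok, err = pcall(request.await, request)
print(aborted, ok, err)
```

In Neovim, a timeout would then amount to scheduling the same call, for instance vim.defer_fn(function() request:cancel("timeout") end, 500). A coroutine that is already suspended in await gets resumed by the rejection and sees the error raised from await.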

Issue #19624 in the Neovim repository outlines several ideas for a canonical asynchronous library (or at the very least, links to various existing implementations).

§
Acknowledgement

The simple coroutine approach described in this article was inspired by Using coroutines in Neovim Lua written by Grzegorz Milka. He extends the concept further by generalizing the transformation of synchronous functions into asynchronous ones. He also developed coop.nvim, a library that puts these ideas into practice.
