- Published on
Be Careful Not to Hog the Eventloop in Python
- Authors
- Name
- Yair Mark
- @yairmark
I have been integrating with several libraries in Python to use web services and sockets to get information from different sources.
This information needed to come in as soon as it became available. As a result, using websockets (where possible) made sense.
I kept running into an issue where only one of the many async functions I was using was being hit. To understand what was happening I dove deeper into what happens with async in Python by reading this excellent article. Based on this I understood how the event loop works and how different coroutines can be run in parallel.
To understand the problem I ran into, let's see how we can run async functions in parallel in Python:
Running Multiple Async Coroutines in Parallel
To run multiple async functions in parallel you have to make sure that:
- You defined these functions as coroutines.
- You create tasks out of these.
- You use asyncio.gather on these tasks.
Defining Functions as Coroutines
Defining a coroutine is fairly simple in Python. It looks as below:
async def my_async_function_1():
    # do whatever async things you need to here
    # Use await where necessary when waiting on some server call or other IO
    pass
async def my_async_function_2():
    # do whatever async things you need to here
    # Use await where necessary when waiting on some server call or other IO
    pass
Creating Tasks from Coroutines
We then turn these coroutines into tasks using the helper functions in asyncio:
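# ...
import asyncio
# asyncio.create_task needs a running event loop, so this line has to
# execute inside a coroutine (for example main()), not at module level
tasks = [asyncio.create_task(my_async_function_1()), asyncio.create_task(my_async_function_2())]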
Note in the above how we call the coroutine function and do not simply pass the function itself.
The reason for this is that calling an async function without awaiting it does not execute the function. It returns an awaitable coroutine object, which is what asyncio.create_task expects.
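To make the difference between calling and awaiting concrete, here is a minimal sketch. The name fetch_value is made up for illustration, and asyncio.sleep(0) only stands in for a real IO wait.

```python
import asyncio
import inspect


async def fetch_value():
    # Hypothetical coroutine standing in for some IO-bound call
    await asyncio.sleep(0)
    return 42


async def main():
    coro = fetch_value()  # calling only builds a coroutine object, the body has not run yet
    is_coroutine = inspect.iscoroutine(coro)
    result = await coro  # awaiting is what actually runs the body
    return is_coroutine, result


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Passing fetch (without the parentheses) to asyncio.create_task would fail, because create_task wants the coroutine object and not the function that produces it.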
Gathering and Running the Tasks in Parallel
Finally we gather the tasks and run them together:
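# ...
async def main():
    # Create the tasks here, where the event loop is already running
    tasks = [asyncio.create_task(my_async_function_1()), asyncio.create_task(my_async_function_2())]
    await asyncio.gather(*tasks)

if __name__ == '__main__':
    asyncio.run(main())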
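Putting the three steps together, a small self-contained sketch might look like the following. The worker coroutine, its names and its delays are assumptions for illustration; asyncio.sleep stands in for a real network wait.

```python
import asyncio


async def worker(name, delay, events):
    # Hypothetical coroutine: asyncio.sleep stands in for a real IO wait
    events.append(f"{name} start")
    await asyncio.sleep(delay)
    events.append(f"{name} end")
    return name


async def main():
    events = []  # records the order in which the workers make progress
    tasks = [
        asyncio.create_task(worker("a", 0.1, events)),
        asyncio.create_task(worker("b", 0.05, events)),
    ]
    # gather returns the results in the order the tasks were passed in,
    # regardless of which task finished first
    results = await asyncio.gather(*tasks)
    return results, events


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Both workers start before either one finishes, which is the interleaving that gather gives you when every coroutine awaits its IO.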
Gotchas
Trying to Await Something that is not async
One thing that caught me out with this is that one of the libraries I was using had websocket integration, but the implementation it used was not async. I was awaiting it, but as the underlying library was not async my await was useless and this library ended up hogging the event loop.
When using async, a single event loop is used to manage the execution of the different async functions. It is not parallel. Instead, when some expensive IO operation is in progress, await tells the event loop manager that this function is busy waiting and that it can carry on processing another async function. When that function awaits, the manager moves on to the next async function.
As IO operations take a long time, the CPU is not forced to sit waiting and can instead process some other async function as specified by the event loop manager. In my case, the issue I was having is that my library got a hold of the event loop and did not let go. As this library was not async, its calls blocked without ever yielding control back to the event loop, so awaiting it made no difference. The library ended up hogging the event loop and nothing else could execute as a result. I fixed this by integrating with the target endpoint's websocket API myself in an async manner.
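The effect can be reproduced with a small sketch. Everything here is an assumption for illustration: time.sleep stands in for the synchronous library call, and a heartbeat task counts how often the event loop gets a chance to run something else. The last variant uses asyncio.to_thread (Python 3.9+), which is one common workaround when a blocking library cannot be replaced; it is not what I did in my case.

```python
import asyncio
import time


async def heartbeat(ticks):
    # Records a tick roughly every 10 ms, but only when the event loop is free to run it
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def blocking_call():
    # Stand-in for a synchronous library call: it never yields to the event loop
    time.sleep(0.2)


async def cooperative_call():
    # A real async wait: the event loop can run other tasks in the meantime
    await asyncio.sleep(0.2)


async def offloaded_call():
    # The same blocking call, run in a worker thread so the loop stays free
    await asyncio.to_thread(time.sleep, 0.2)


async def count_ticks(call):
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick
    await call()
    beat.cancel()
    return len(ticks)


async def main():
    blocked = await count_ticks(blocking_call)
    cooperative = await count_ticks(cooperative_call)
    offloaded = await count_ticks(offloaded_call)
    return blocked, cooperative, offloaded


if __name__ == "__main__":
    print(asyncio.run(main()))
```

With the blocking call the heartbeat only manages its very first tick, while the other two variants let it keep ticking for the whole 0.2 seconds.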
Forgetting to Await Something that is Async
This is an easy one to do. You simply forget to await some_async_function(). Luckily this is easy to spot: when you hit this piece of code, a "coroutine ... was never awaited" RuntimeWarning will pop up in your logs. If you use PyCharm this will also be highlighted in yellow for you.
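A minimal sketch of what this looks like in practice is below. It captures the warning with the standard warnings module only so that it can be inspected; normally it would simply be printed to stderr. The gc.collect() call is there as a precaution so the discarded coroutine object is finalized inside the with block.

```python
import asyncio
import gc
import warnings


async def some_async_function():
    return "done"


async def main():
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        some_async_function()  # missing await: the body never runs
        gc.collect()  # make sure the discarded coroutine object is finalized
    messages = [str(w.message) for w in caught]
    result = await some_async_function()  # the correct call
    return messages, result


if __name__ == "__main__":
    print(asyncio.run(main()))
```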