
Dive into Python's asyncio, part 1

Concurrency was not seriously taken into account when Python was designed. Until version 3.4, there were two options:

- the threading module
- the multiprocessing module

Although these two modules provided programmers with handy primitives and APIs, they both have considerable downsides. Due to the presence of the GIL, threaded Python code never actually runs in parallel, so all attempts to leverage multiple cores with threads were bound to fail.
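To make the GIL's effect concrete, here is a small sketch (the `busy` function and the workload size are made up for illustration): a CPU-bound function run in two threads takes roughly as long as running it twice sequentially, because only one thread can hold the GIL at a time.

```python
import threading
import time

def busy(n):
    # pure-Python, CPU-bound work; the thread holds the GIL while computing
    total = 0
    for i in range(n):
        total += i * i
    return total

N = 2_000_000

# run the work twice sequentially
start = time.perf_counter()
busy(N)
busy(N)
sequential = time.perf_counter() - start

# run the same work in two threads
results = {}
threads = [
    threading.Thread(target=lambda k=k: results.update({k: busy(N)}))
    for k in ("a", "b")
]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

# On CPython, `threaded` is typically close to `sequential`, not half of it.
print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```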

On the other hand, programs using multiprocessing can easily use all the available CPU power. Nevertheless, concurrency with separate processes is a heavyweight solution: it consumes more memory and limits how many independent execution units can operate in parallel. The need for synchronization is also an additional burden for the OS, because processes share no memory and one must use OS-provided mechanisms to send messages between them.
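By contrast, a minimal multiprocessing sketch (the `square` function is invented for illustration) really does fan the work out across separate processes, each with its own interpreter and its own GIL:

```python
from multiprocessing import Pool

def square(n):
    # executed in a worker process with its own interpreter and GIL
    return n * n

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # the iterable is split into chunks and mapped across the workers
        print(pool.map(square, range(8)))  # prints [0, 1, 4, 9, 16, 25, 36, 49]
```

Note the `if __name__ == "__main__":` guard, which multiprocessing needs on platforms where workers re-import the main module.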

Let's not write them off, though. These modules may still be very useful in particular situations.

Writing software for the contemporary web requires a higher level of concurrency than Python was able to provide. Or at least it was unable until version 3.4, when a shiny new toy was introduced: the asyncio module. Two minor versions later, when 3.6 came out, we were finally assured of the stability of asyncio's API:

Starting with Python 3.6 the asyncio module is no longer provisional and its API is considered stable.

Asyncio enables us to write highly concurrent code that can cheaply switch context while waiting for an I/O operation. This is possible due to a simple observation: a subprogram does not need the CPU while it waits for data to arrive over the network. In other words, asyncio provides better CPU utilization in I/O-bound applications. It also means there is no sense in using asyncio for CPU-heavy calculations.
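A quick way to see this cheap context switching, using asyncio.sleep as a stand-in for a network wait (the `fake_io` coroutine is made up for illustration): three one-second "operations" awaited together finish in about one second, not three.

```python
import asyncio
import time

async def fake_io(delay, label):
    # while this coroutine sleeps, the event loop is free to run the others
    await asyncio.sleep(delay)
    return label

async def main():
    start = time.perf_counter()
    results = await asyncio.gather(
        fake_io(1, "a"), fake_io(1, "b"), fake_io(1, "c")
    )
    elapsed = time.perf_counter() - start
    print(results, f"{elapsed:.1f}s")  # about 1s in total, not 3s
    return results, elapsed

loop = asyncio.new_event_loop()
results, elapsed = loop.run_until_complete(main())
loop.close()
```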

Python 3.5 simplified and beautified the syntax by introducing new keywords: async and await. I believe these two are borrowed from C#.

One of the simplest examples I can think of is making several HTTP requests concurrently:

import asyncio
import aiohttp


async def example():
    async with aiohttp.ClientSession() as session:
        results = await asyncio.gather(   # 3
            get('http://httpbin.org/user-agent', session),
            get('http://httpbin.org/headers', session),
            get('http://httpbin.org/cookies', session)
        )
        print(results)


async def get(url, session):
    async with session.get(url) as resp:
        return await resp.text()


loop = asyncio.get_event_loop()  # 1
loop.run_until_complete(example())   # 2
  1. asyncio is based on a so-called event loop. More details below.
  2. Our program runs as long as the event loop is running. We want it to terminate after getting all responses.
  3. asyncio.gather takes multiple coroutines or futures and can be awaited until all of its arguments finish.
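A side note: on Python 3.7 and later, the two event-loop lines can be replaced by asyncio.run, which creates a fresh loop, runs the coroutine to completion, and closes the loop. A minimal sketch with a stand-in coroutine:

```python
import asyncio

async def example():
    await asyncio.sleep(0)  # a stand-in for real awaitable work
    return "done"

# Python 3.7+: sets up the event loop, runs the coroutine, tears the loop down
result = asyncio.run(example())
print(result)  # prints done
```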

The event loop is the heart of asyncio. This is where context switching takes place; the loop is responsible for reacting to finished I/O operations. Conceptually, there is no difference between asyncio's event loop and the one behind node.js and its libuv. However, you are unlikely to fall into callback hell when writing asyncio code, because it simply looks like synchronous code. There is no need to pass callbacks.
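To see why coroutines avoid callback hell, compare the two styles on the same loop (the `start_request`/`on_response`/`request` names are invented for illustration): the callback version has to split its continuation into a separate function, while the coroutine version simply continues on the next line.

```python
import asyncio

# Callback style: the "rest of the computation" must live in another function.
def start_request(loop, fut):
    loop.call_later(0.01, on_response, fut)

def on_response(fut):
    fut.set_result("callback result")

# Coroutine style: the continuation is just the next statement.
async def request():
    await asyncio.sleep(0.01)
    return "coroutine result"

loop = asyncio.new_event_loop()
fut = loop.create_future()
start_request(loop, fut)
res1 = loop.run_until_complete(fut)        # waits until on_response fires
res2 = loop.run_until_complete(request())
loop.close()
print(res1)  # prints callback result
print(res2)  # prints coroutine result
```

With two steps the difference is cosmetic; chain five dependent I/O operations and the callback version fragments into five functions, while the coroutine stays one readable block.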

This is the first post in the asyncio series. The next one will cover asyncio's cooperative libraries and designing asyncio apps.

 

This post is licensed under CC BY 4.0 by the author.
