Quick Tour of The Concurrent Futures Library

Threading in Python often gets a bad rap, but the situation has improved a lot since the concurrent futures library (concurrent.futures) was introduced in Python 3.2. Threads will only give you an increase in performance in certain circumstances, though: they are only recommended for IO-bound tasks. The reasons are too complicated to go into here, but they relate to the workings of Python’s GIL (Global Interpreter Lock). Those looking to improve the performance of CPU-heavy tasks need to use multiprocessing instead. In fact, the concurrent futures library provides the same interface for working with both threads and processes. The code in this post focuses on using the library with threads, but many of the same patterns can be applied to code that uses a process pool.

Thread Pools

A core part of the concurrent futures library is the ability to create a thread pool executor. Thread pools make it much easier to manage a group of threads. We simply create an instance of a thread pool and set the number of threads we want to use. We can then submit jobs to be run by the thread pool. In the simplest case we just submit a job and ignore the future that comes back, which means that at this point we have no way to work with the results.
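As a minimal sketch of submitting a single job to a pool: the get_page function here is a hypothetical stand-in that sleeps instead of making a real request, and the URL and pool size are arbitrary choices for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def get_page(url):
    # Hypothetical stand-in for a blocking request: sleep, then return fake content.
    time.sleep(0.1)
    return f"<html>contents of {url}</html>"


# Leaving the with-block shuts the pool down and waits for submitted jobs.
with ThreadPoolExecutor(max_workers=5) as executor:
    # submit() returns a Future immediately; the job runs on a pool thread.
    future = executor.submit(get_page, "https://example.com/")

# The job has finished by now, but nothing has asked the future for its result.
print(future.done())  # True
```

Using the executor as a context manager is one reasonable choice here, since it avoids having to call shutdown() by hand.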

Futures – As Completed

We are going to begin by looking at the as completed pattern. This allows us to submit a bunch of different tasks to our thread pool and retrieve each result as soon as its task has completed, rather than in the order the tasks were submitted. This construct can be very handy if we want to run a bunch of blocking IO tasks and process every result as it becomes available. The get_page function is the same as before and has been omitted for brevity.
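The pattern can be sketched roughly as follows. Since the original get_page is not shown, this version is a hypothetical stand-in that takes an artificial delay instead of fetching anything, and the URLs and delays are made up for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def get_page(url, delay):
    # Hypothetical stand-in: simulate a request that takes `delay` seconds.
    time.sleep(delay)
    return f"contents of {url}"


# Made-up URLs mapped to simulated latencies.
jobs = {
    "https://example.com/slow": 0.3,
    "https://example.com/fast": 0.05,
    "https://example.com/medium": 0.15,
}

results = []
with ThreadPoolExecutor(max_workers=3) as executor:
    # Remember which URL each future belongs to.
    future_to_url = {
        executor.submit(get_page, url, delay): url for url, delay in jobs.items()
    }
    # as_completed yields each future as soon as it finishes.
    for future in as_completed(future_to_url):
        url = future_to_url[future]
        try:
            results.append((url, future.result()))
        except Exception as exc:
            print(f"{url} failed: {exc}")

for url, body in results:
    print(url, "->", body)
```

With these delays the faster pages will typically be reported first, although the exact order depends on thread scheduling.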

Here we simply submit our tasks to the thread pool executor and then wait for all of the submitted tasks to be completed. We can also set an optional timeout, which stops the wait if the tasks take too long. Note that the timeout only limits how long we wait; it does not cancel tasks that are already running.
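One way to express this is with the wait function from concurrent.futures, sketched below. The slow_task helper, its durations, and the timeout value are assumptions chosen so that one task finishes in time and one does not.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def slow_task(seconds):
    # Hypothetical blocking task: sleep, then report how long it slept.
    time.sleep(seconds)
    return seconds


with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(slow_task, s) for s in (0.05, 1.0)]

    # Block until every future is done or the timeout expires, whichever is first.
    done, not_done = wait(futures, timeout=0.3)

    finished_in_time = sorted(f.result() for f in done)
    still_running = len(not_done)

print("finished:", finished_in_time)
print("still running when the timeout expired:", still_running)
```

wait returns two sets, the futures that finished and the ones that did not. The unfinished task keeps running in the background, and the with-block still waits for it on exit.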

Futures – Mapping

We can also use the thread pool executor’s map function to take a group of tasks and map them onto the different threads in our thread pool. The first argument to the call is the function in question, followed by an iterable of arguments to be passed into that function. Unlike submit, map does not hand back futures: it returns an iterator that yields the results directly, in the same order as the inputs. Getting the results is therefore relatively easy; we can simply call list on the returned iterator. This gives us a list of results that we can work with just as if they had been returned from a normal function. If one of the calls raised an exception, that exception is raised again when its result is retrieved from the iterator.
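A short sketch of the mapping approach, assuming a hypothetical get_length function that pretends to fetch a page and returns a number derived from the URL:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def get_length(url):
    # Hypothetical stand-in for "fetch the page and measure it".
    time.sleep(0.05)
    return len(url)


urls = [
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/contact-us",
]

with ThreadPoolExecutor(max_workers=3) as executor:
    # map() yields results in input order, regardless of which call finishes first.
    lengths = list(executor.map(get_length, urls))

for url, length in zip(urls, lengths):
    print(url, length)
```

Because the results come back in input order, they can be paired with the original arguments using zip.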

Futures – Callback

Callbacks provide Python users with one of the most powerful methods for working with thread pools. With a callback we can submit a task to our thread pool and then call add_done_callback on the future object returned from the submission. What is more tricky is that the callback takes only one argument, which is the future itself rather than its result. Inside the callback we can then inspect the future, for example checking whether an exception was raised or whether the future was cancelled before it was able to complete its task. We can then finally retrieve and process the result of the future. This allows for some more complicated concurrent programming patterns, with the callbacks feeding a queue of additional jobs to be processed.
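The callback flow might look something like the sketch below. The get_page function, the on_done callback, and the handled list are all hypothetical names for this example; get_page fails on purpose for one URL so that the exception branch is exercised.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

handled = []
handled_lock = threading.Lock()


def get_page(url):
    # Hypothetical stand-in: fail for one URL to show error handling.
    if url.endswith("/bad"):
        raise ValueError(f"bad url: {url}")
    return f"contents of {url}"


def on_done(future):
    # The callback receives the future itself, not its result.
    if future.cancelled():
        outcome = "cancelled"
    elif future.exception() is not None:
        outcome = f"failed: {future.exception()}"
    else:
        outcome = future.result()
    # Callbacks usually run on a worker thread, so guard the shared list.
    with handled_lock:
        handled.append(outcome)


with ThreadPoolExecutor(max_workers=2) as executor:
    for url in ("https://example.com/ok", "https://example.com/bad"):
        executor.submit(get_page, url).add_done_callback(on_done)

print(handled)
```

Checking cancelled() first matters, because calling exception() or result() on a cancelled future raises CancelledError. In a larger program the callback could push outcomes onto a queue.Queue instead of a list, which is one way to feed follow-up jobs.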
