Basic Introduction to Cython

Python is often criticized for being slow. Although pure Python is fast enough in many cases, there are situations where it may not give you the performance you need. In recent years a fair number of Python programmers have made the jump to Golang for performance reasons. However, there are a number of ways to improve the performance of your Python code, such as using PyPy, writing C extensions or trying your hand at Cython.

What is Cython?

Cython is a superset of the Python language. This means that the vast majority of Python code is also valid Cython code. Cython allows users to write Cython modules, which are then compiled and can be imported into Python code. This means that users can port performance-critical code to Cython and often see an immediate increase in performance. The great thing about Cython is that you decide how far to optimize your code. Simply copying and compiling your Python code might yield performance gains of 8-12%, whereas more serious optimization can lead to significantly better performance.

Installing Cython

Installing Cython on Linux is very easy and only requires running the ‘pip install cython’ command. Those on Windows will likely have a much tougher time; the simplest solution seems to be installing ‘Visual Studio Community’ and selecting both the C++ and Python support options. You can then install Cython like you would any other Python package.

Using Pure Python

We are going to begin by compiling a pure Python function. This is a very simple task and can achieve some limited performance benefits, with a more noticeable increase for functions that make use of for and while loops. To begin, we simply save the below code into a file called ‘looping.pyx’.
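A minimal sketch of such a function is shown here. The function name ‘multiply_by_index’ and the variable names are assumptions chosen for illustration.

```python
# looping.pyx -- plain Python, so it is also valid Cython.
# The name multiply_by_index is a hypothetical choice for this sketch.
def multiply_by_index(numbers):
    length = len(numbers)
    total = 0
    for index in range(length):
        # Multiply each number by its position and accumulate the sum.
        total += numbers[index] * index
    return total
```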

This very simple function takes a list of numbers, multiplies each number by its index and returns the sum of the results. This code is valid as both Python and Cython. However, it takes no advantage of any Cython optimizations other than the compilation of the code into C.

We run the below command to create a Cython module which can be used in Python:
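One common way to build the module is a small ‘setup.py’ build script. This is only a sketch of that approach and not necessarily the command used for the tests in this post; it assumes that setuptools and Cython are installed and that ‘looping.pyx’ sits in the same directory.

```python
# setup.py -- a minimal build script sketch (assumes setuptools and Cython
# are installed and looping.pyx is in the current directory).
from setuptools import setup
from Cython.Build import cythonize

# cythonize() translates the .pyx file to C and describes the extension
# module that setuptools should compile.
setup(ext_modules=cythonize("looping.pyx"))

# Typically run from the shell as:
#   python setup.py build_ext --inplace
```

Building in place leaves the compiled extension next to the source file, so it can be imported from that directory.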

We can then import our Cython module into Python in the following manner:
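As a rough sketch, the import looks like any other Python import. The module name ‘looping’ comes from the file name, while ‘multiply_by_index’ is an assumed function name; the guard is only there so the snippet does not fail if the module has not been built yet.

```python
# Import the compiled extension like a normal module. The module name comes
# from looping.pyx; multiply_by_index is a hypothetical function name.
try:
    import looping
except ImportError:
    looping = None  # the extension has not been built (or is not on the path)

if looping is not None:
    # 1*0 + 2*1 + 3*2 + 4*3
    print(looping.multiply_by_index([1, 2, 3, 4]))
```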

What kind of performance benefits can we expect from just compiling this Python code into a Cython module?

I ran some tests and, on average, the Cython-compiled version of the code took around 10% less time to run over a list of 10,000 numbers.
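A comparison like this could be made with the standard ‘timeit’ module. The harness below is a sketch under assumptions, not the exact test used for the figure above: the function names are hypothetical, and the compiled module is only timed if it can be imported.

```python
import timeit


def multiply_by_index_py(numbers):
    """Pure Python baseline (hypothetical name)."""
    total = 0
    for index in range(len(numbers)):
        total += numbers[index] * index
    return total


def time_function(func, data, repeat=3, number=20):
    """Return the best observed time per call, in seconds."""
    timings = timeit.repeat(lambda: func(data), repeat=repeat, number=number)
    return min(timings) / number


data = list(range(10_000))
py_time = time_function(multiply_by_index_py, data)
print(f"pure Python: {py_time * 1000:.3f} ms per call")

try:
    import looping  # compiled module, assumed to expose multiply_by_index
except ImportError:
    looping = None

if looping is not None:
    cy_time = time_function(looping.multiply_by_index, data)
    print(f"Cython:      {cy_time * 1000:.3f} ms per call")
    print(f"time saved:  {(1 - cy_time / py_time) * 100:.1f}%")
```

Results will vary with the machine, the Python version and the compiler, so the percentages should be treated as indicative only.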

Adding Types

Cython achieves optimization of code by introducing typing to Python code. Cython supports a range of both Python and C types. Python types tend to be more flexible but give you less in terms of performance benefits. The below example makes use of both C and Python types; however, we have to be very careful when using C types. For instance, we could trigger an overflow error should the numbers we pass in be large enough that the result of the multiplication is too large to store in a C long.
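A sketch of what such a hybrid function might look like is given here in Cython syntax. The file name, the function name ‘multiply_by_index’ and the choice of ‘Py_ssize_t’ for the length are assumptions made for illustration.

```cython
# typed_looping.pyx -- hypothetical file name; this is Cython, not plain Python.
def multiply_by_index(list numbers):
    # Two C-typed locals: the length of the list and the running total.
    cdef Py_ssize_t length = len(numbers)
    cdef long total = 0

    for index in range(length):
        # numbers[index] is a Python object, so the arithmetic is done on
        # Python objects and the result is converted back to a C long when
        # it is stored in total. If it does not fit, OverflowError is raised.
        total += numbers[index] * index
    return total
```

Passing anything other than a list (or None) for ‘numbers’ raises a TypeError at call time, which is the trade-off for the type declaration.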

As you can see, we use the Python type ‘list’ to type annotate our input list. We then define two C types which will be used to store the length of our list and our output. We then loop over our list in exactly the same way as we did in our previous example. This shows just how easy it is to start adding C types to Python code with the help of Cython. It also illustrates how easy it is to mix both C and Python types together in one extension module.

When tested, this hybrid code was between 15% and 30% faster than the pure Python implementation, without taking the most aggressive path of optimization and turning everything into a C type. While these savings may seem small, they can really add up on operations which are repeated hundreds of thousands of times.

Cython Function Types

Unlike standard Python, Cython has three types of functions. These functions differ in how they are defined and where they can be used.

  • Cdef functions – can only be used in Cython code and cannot be imported into Python.
  • Cpdef functions – can be used and imported in both Python and Cython. If used in Cython they behave as a Cdef function, and if used in Python they behave as if they were a standard Python function.
  • Def functions – are like your standard Python functions and can be used and imported into Python code.

The below code block demonstrates how each of these three function types can be defined.
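A sketch of the three definitions is given here in Cython syntax. The function names are hypothetical and the bodies are kept deliberately trivial.

```cython
# functions.pyx -- hypothetical file name; all function names are illustrative.

# cdef: compiled to a plain C function, callable only from Cython code.
cdef long c_square(long x):
    return x * x


# cpdef: a fast C entry point for Cython callers plus a Python wrapper,
# so it can also be imported and called from Python.
cpdef long cp_square(long x):
    return x * x


# def: an ordinary Python function, importable from Python as usual.
def py_square(x):
    # A def function can call a cdef function defined in the same module.
    return c_square(x)
```

After compiling, importing this module from Python would expose ‘cp_square’ and ‘py_square’ but not ‘c_square’.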

This allows you to define highly performant Cdef functions for use within Cython modules, while at the same time allowing you to write functions that are totally compatible with Python.  Cpdef functions are a good middle ground, in the sense that when they are used in Cython code they are highly optimized while remaining compatible with Python, should you want to import them into a Python module.

Conclusion

While this introduction only touches the surface of the Cython language, it should be enough to begin optimizing code using Cython. However, some of the more aggressive optimizations and the full power of C types are well beyond the scope of this post.
