Performant Python with Intel MKL
15 October 2014
ARCCA have recently been helping a research group at Cardiff University to benchmark their code. The code was based on Python and had many options for performing FFT analysis. One particular option was to use the Intel MKL directly within the Python code. This was particularly useful since the MKL includes optimisations for the newest processor types, so the benchmark should give a fairer reflection of what speed improvements could be achieved.
What is the MKL?
The MKL (Math Kernel Library) is provided by Intel (see: MKL website) and offers optimised implementations of common numerical tasks, such as the matrix operations found in LAPACK and FFT routines similar to those in libraries such as FFTW.
The MKL has been through a number of versions, the latest at the time of writing being version 11.2 (note the different numbering scheme compared to the Intel compilers). The different versions can make a big difference, so it is always worth reading the MKL release notes to make sure the version you are using is the most appropriate for your platform and for the MKL functions you call. For example, AVX2 optimisations were recently introduced to the FFT section of the MKL, which makes a big difference on newer hardware.
Calling MKL from Python
Python is a very flexible language and, thanks to its tight integration with C, it can call C libraries with relative ease. For example, calling the MKL's cblas_dgemm routine through the ctypes module is a matter of just:
    from ctypes import *

    # Load the MKL library
    mkl = cdll.LoadLibrary("./libmkl_rt.so")

    # Point to the function we want to use
    cblas_dgemm = mkl.cblas_dgemm

    # Create all the variables
    # <insert setting up data here>
    ...

    # Call the function
    cblas_dgemm(c_int(Order), c_int(TransA), c_int(TransB),
                c_int(m), c_int(n), c_int(k),
                c_double(alpha), byref(a), c_int(lda),
                byref(b), c_int(ldb),
                c_double(beta), byref(c), c_int(ldc))
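The data set-up step is where most of the care is needed, so here is a minimal sketch of one way it could look. It is not the research group's code. The helper names (to_c_doubles, load_mkl, matmul) are hypothetical, the library path is an assumption about your installation, and the sketch assumes the default LP64 interface of libmkl_rt (32-bit integers). If the MKL cannot be loaded, it falls back to a slow pure-Python loop so that it still runs.

```python
import ctypes

# CBLAS enum values as defined in the standard cblas.h header.
CBLAS_ROW_MAJOR = 101
CBLAS_NO_TRANS = 111


def to_c_doubles(rows):
    """Flatten a list of rows into a contiguous, row-major C array of doubles."""
    flat = [float(x) for row in rows for x in row]
    return (ctypes.c_double * len(flat))(*flat)


def load_mkl(path="libmkl_rt.so"):
    """Return a handle to the MKL runtime, or None if it cannot be loaded.

    The path is an assumption; adjust it to match your installation.
    """
    try:
        return ctypes.CDLL(path)
    except OSError:
        return None


def matmul(rows_a, rows_b, mkl=None):
    """Compute C = A * B, using cblas_dgemm when an MKL handle is supplied."""
    m, k, n = len(rows_a), len(rows_b), len(rows_b[0])
    a, b = to_c_doubles(rows_a), to_c_doubles(rows_b)
    c = (ctypes.c_double * (m * n))()  # zero-initialised output buffer

    if mkl is None:
        # Pure-Python fallback so the sketch still runs without the MKL.
        for i in range(m):
            for j in range(n):
                c[i * n + j] = sum(a[i * k + p] * b[p * n + j] for p in range(k))
    else:
        dgemm = mkl.cblas_dgemm
        ptr = ctypes.POINTER(ctypes.c_double)
        # Declaring the signature lets ctypes convert the arguments for us.
        dgemm.restype = None
        dgemm.argtypes = [ctypes.c_int] * 6 + [
            ctypes.c_double, ptr, ctypes.c_int, ptr, ctypes.c_int,
            ctypes.c_double, ptr, ctypes.c_int]
        # Row-major storage: the leading dimensions are the row lengths.
        dgemm(CBLAS_ROW_MAJOR, CBLAS_NO_TRANS, CBLAS_NO_TRANS,
              m, n, k, 1.0, a, k, b, n, 0.0, c, n)

    return [[c[i * n + j] for j in range(n)] for i in range(m)]


result = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], mkl=load_mkl())
print(result)
```

Setting argtypes up front is a design choice rather than a requirement. It avoids wrapping every argument in c_int or c_double at the call site and catches some type mistakes before they reach the C library.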
Alternative routes
Alternatively, Intel now provides instructions for using the MKL directly in the popular SciPy and NumPy Python packages. See the Intel article at Numpy with MKL. This could really help you get the most from your processor.
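If you take this route, it is worth confirming which libraries your NumPy build is actually linked against. The sketch below is one rough way to do that, and the function name numpy_build_mentions_mkl is hypothetical. It captures the text printed by numpy.show_config() and searches it for "mkl", which is only a heuristic because the layout of that output varies between NumPy versions.

```python
import contextlib
import io


def numpy_build_mentions_mkl():
    """Return True/False if NumPy's build info mentions MKL, or None without NumPy."""
    try:
        import numpy as np
    except ImportError:
        return None

    # show_config() prints to stdout, so capture the text and search it.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return "mkl" in buffer.getvalue().lower()


print(numpy_build_mentions_mkl())
```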
What's the catch?
As with any vendor-provided solution, you only get the most out of it on Intel processors, although it can improve things on non-Intel platforms as well. Also, as with any numerical library, you may find that the results differ from those of your existing solution and, since the source code is unavailable, you may feel uncomfortable performing science where results are based on a vendor-supplied library.
Taking all this into account, it is well worth investigating whether the MKL would allow you to have fast Python code without the pain of writing it yourself.
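One common reason results differ is that an optimised library may carry out floating-point operations in a different order, and floating-point addition is not associative. The sketch below shows the effect with plain Python floats, along with a hypothetical helper, results_agree, for comparing two sets of results to within a tolerance instead of bit for bit. The tolerance is illustrative and should be chosen to suit your problem.

```python
import math

values = [0.1, 0.2, 0.3]

# The same three numbers, summed in two different orders.
forward = (values[0] + values[1]) + values[2]
backward = (values[2] + values[1]) + values[0]

print(forward == backward)  # the two sums differ in the last bit


def results_agree(reference, candidate, rel_tol=1e-12):
    """Check two result sequences element by element within a tolerance."""
    return len(reference) == len(candidate) and all(
        math.isclose(r, c, rel_tol=rel_tol, abs_tol=rel_tol)
        for r, c in zip(reference, candidate))


print(results_agree([forward], [backward]))
```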