I spent some time today profiling a Brownian dynamics simulation written in Python to see how I could make it faster before starting some long runs on a Linux cluster. In the sections below, I have tried to quantify the speed improvements from various changes. Keep in mind that the speedup in your own code will vary, depending on where its bottlenecks are. See my post about profiling Python code. Another caveat: I am running Python 2.4.4 because it’s installed on our cluster.
Use SciPy to implement fast lookup tables
Please see my previous post about how to implement fast lookup tables in Python using interp1d and UnivariateSpline. interp1d seems to be faster when operating on a NumPy array, but UnivariateSpline is faster when operating on individual floating-point numbers. scipy.interpolate.UnivariateSpline wraps a compiled Fortran library (FITPACK), and it’s pretty much a drop-in replacement for interp1d. The one thing it can’t do is return a default value for out-of-range x values.
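As a minimal sketch of the idea (the tabulated function and grid here are illustrative, not from the post): build both kinds of lookup table from the same sample points, then query one with an array and the other with a scalar.

```python
import numpy as np
from scipy.interpolate import interp1d, UnivariateSpline

# Tabulate a (pretend-expensive) function on a fine grid.
x = np.linspace(0.0, 10.0, 1000)
y = np.exp(-x) * np.sin(x)

lookup_interp = interp1d(x, y)               # linear interpolation
lookup_spline = UnivariateSpline(x, y, s=0)  # s=0 gives an interpolating spline

# interp1d tends to be faster when evaluated on whole arrays...
xs = np.linspace(0.5, 9.5, 100)
ys = lookup_interp(xs)

# ...while UnivariateSpline tends to be faster for one scalar at a time,
# the common case inside a simulation's inner loop.
y_one = lookup_spline(4.2).item()
```

Note that, as the post says, interp1d raises an error for x values outside the tabulated range (unless told otherwise), while UnivariateSpline will happily extrapolate, so neither gives you a default value for free.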
For scalar math, Python functions are faster than numpy functions
In one method, I realized that I had accidentally imported the sqrt() function from numpy, but I was only using it to compute the square root of scalar floating-point numbers. Computing square roots is relatively expensive, and from previous experience I suspected that numpy might be slower than the built-in Python math functions for scalars, so I knew this might be an easy place to find some speed. To quantify how much speed, I wrote this little test script:
from math import sqrt
from time import time

num_repeats = 10000000

start_time = time()
for i in range(num_repeats):
    sqrt(1012.1234215)
print "math.sqrt() Elapsed time: " + str(time()-start_time)

from numpy import sqrt

start_time = time()
for i in range(num_repeats):
    sqrt(1012.1234215)
print "numpy.sqrt() Elapsed time: " + str(time()-start_time)
It turns out that the sqrt() function from the standard Python math module is about seven times faster than the corresponding sqrt() function from numpy. As a side note, I learned that it is slightly faster (5-10%) to use the form “from math import sqrt” than it is to use “import math” and “math.sqrt()”, because the latter pays for an attribute lookup on every call. For the same reason, you may not want to use “from numpy import *” if you are really concerned about speed: the wildcard import silently replaces fast scalar functions like math.sqrt() with their slower NumPy counterparts.
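The two import styles can be compared with a short timeit sketch; the exact gap will vary by machine and interpreter version, but the direct-name form should come out ahead.

```python
from timeit import timeit

n = 1000000

# Direct name: one global lookup per call.
t_from = timeit("sqrt(1012.1234215)", setup="from math import sqrt", number=n)

# Attribute access: a global lookup plus an attribute lookup per call.
t_attr = timeit("math.sqrt(1012.1234215)", setup="import math", number=n)

print("from math import sqrt: %.3f s" % t_from)
print("math.sqrt():           %.3f s" % t_attr)
```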
Use Numpy for generating random numbers
Another significant improvement in speed came when I switched from the gauss() function of the standard-library random module to the normal() function from numpy.random. Using a script similar to the one above, I found that the numpy version is about four times faster than random.gauss(). (Generating a random number from a uniform distribution with random.uniform, by contrast, is already very fast.) If you look at the source code of random, you will see why: drawing one random number from a Gaussian distribution requires computing a square root, a log, a cosine, and a sine, all in Python. Numpy is faster because it does the math in C.
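Here is a sketch of that comparison; the per-call numbers will differ across NumPy versions and machines, and drawing many samples in a single vectorized call is an even bigger win, since both the loop and the math then run in C.

```python
from timeit import timeit
import numpy as np

n = 100000

# Scalar draws, one per call, from the standard library.
t_gauss = timeit("gauss(0.0, 1.0)", setup="from random import gauss", number=n)

# One vectorized call that produces all n samples at once.
t_batch = timeit("normal(0.0, 1.0, size=%d)" % n,
                 setup="from numpy.random import normal", number=1)

print("random.gauss() x %d:           %.3f s" % (n, t_gauss))
print("numpy.random.normal(size=%d):  %.3f s" % (n, t_batch))

# The batched draw itself:
samples = np.random.normal(0.0, 1.0, size=n)
```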
Function calls are expensive
A minor, but still useful, speed improvement came when I eliminated some unnecessary calls to the built-in functions max(), min(), and abs(). max(num1, num2) can be replaced with
if num1 > num2:
    a = num1
else:
    a = num2
which is about 35% faster, because it avoids the overhead of a function call. min() and abs() can be similarly replaced with if/else statements.
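The comparison can be timed with a sketch like the following, written in statement form to match the replacement shown above; the exact ratio depends on the Python version.

```python
from timeit import timeit

setup = "num1, num2 = 1012.5, -3.25"
n = 1000000

# Built-in function call.
t_max = timeit("a = max(num1, num2)", setup=setup, number=n)

# Inline comparison, no function-call overhead.
t_if = timeit("""
if num1 > num2:
    a = num1
else:
    a = num2
""", setup=setup, number=n)

print("max():    %.3f s" % t_max)
print("if/else:  %.3f s" % t_if)
```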
Eliminate unnecessary import statements
When you are wrapping up development of a module, go back and delete all the unused “from some_module import some_unnecessary_function” statements. Imports only execute once, when the module is first loaded, so this is marginally helpful with speed at best, but it’s a good programming practice.