A lookup table for fast Python math

Numerical programming frequently requires the use of look-up tables. A look-up table is a collection of pre-computed values. When given an “x” value, the table returns a pre-computed “y” value. Look-up tables can be used to speed up numerical codes, when it is faster to look up a value in the table than it is to compute the value. They are also used when the data in the table cannot be computed–for example, experimental data or averaged results from an ensemble of Monte Carlo simulations. Another application is to compute a value when a function cannot be solved algebraically. Assume that you have a formula for a function q(h). You need the value of h for a given value of q, but the formula cannot be algebraically solved to get h(q). Instead, choose a range of h values, compute the function q(h), and store each value in a look-up table. Now you can get h(q) for any value stored in the table. The major limitation of a look-up table is that it cannot return valid results for any value of q which is outside the range of those stored in the table. Depending on its implementation, the table may be able to interpolate to return values between known points.

Scipy includes two handy lookup table classes, but they are sort of hidden. Here is an example of how to create a lookup table using interp1d and UnivariateSpline:

from scipy.interpolate import interp1d, UnivariateSpline
from scipy.stats import norm
from numpy import arange
from numpy.random import uniform
import matplotlib.pyplot as plt
from time import time

deltax = 0.1
xmin = -5.0
xmax = 5.0

# Set up samples
x_samples = arange(xmin, xmax, deltax)
pdf_samples = norm.pdf(x_samples)
fig = plt.plot(x_samples, pdf_samples, 'ro', label='Sampled')

# Interpolate on a finer grid
pdf_interp = interp1d(x_samples, pdf_samples, kind='linear')
x_interp = arange(xmin+deltax, xmax-deltax, 0.001)
plt.plot(x_interp, pdf_interp(x_interp), 'b-', label='Interpolated')

# Do the same thing with UnivariateSpline
u = UnivariateSpline(x_samples, pdf_samples, k=1, s=0.0)
plt.plot(x_interp, u(x_interp), 'k-', label='Spline')

plt.xlabel('x')
plt.ylabel('pdf')
plt.legend()
plt.show()

# Time tests
start_time = time()
for i in range(100000):
    pdf_interp(uniform(xmin+deltax, xmax-deltax))
print("interp1d run time: {}".format(time() - start_time))

start_time = time()
for i in range(100000):
    u(uniform(xmin+deltax, xmax-deltax))
print("UnivariateSpline run time: {}".format(time() - start_time))

x_fine = arange(xmin + deltax, xmax - deltax - 0.001, 1e-6)

start_time = time()
pdf_interp(x_fine)
print("interp1d run time: {}".format(time() - start_time))

start_time = time()
u(x_fine)
print("UnivariateSpline run time: {}".format(time() - start_time))

First, create a Numpy array to store the x values for the lookup table. I’m using the pdf of the normal distribution as an example of a function to interpolate. The table uses linear interpolation to compute values between the known points. Higher-order interpolations can be used, but I don’t need them in this case. Higher-order interpolations can introduce distortions for certain types of data, so they are best avoided unless you really understand the data being fitted. Below is a plot illustrating the results of a look-up table:

SciPy interpolation example
SciPy interpolation example

The red dots are the values stored in the lookup table, the blue line is interpolated using interp1d, and the black line is interpolated using UnivariateSpline (the blue and black lines are indistinguishable). The function was approximated with a lookup table that was created with an evenly spaced array of x values. Notice how the points stored in the table are spaced more closely together in the y dimension for values of x further from the origin. Thus, the approximation becomes less accurate as x approaches 0, which is a consequence of using a linear “x” spacing with a nonlinear function. You could work around this by using a non-uniform array of “x” values to create the lookup table. This is another reason to understand the function you are approximating, instead of blindly placing values into the table.

Peformance

Like all operations in NumPy, interpolation runs much faster if you pass in a Numpy array instead of operating on individual values. On my older Mac laptop, UnivariateSpline is about twice as fast as interp1d when operating on individual floating point numbers, but interp2d is about twice as fast when operating on NumPy arrays:

interp1d run time: 3.5505311489105225
UnivariateSpline run time: 1.5044260025024414
interp1d run time: 0.1413729190826416
UnivariateSpline run time: 0.27219295501708984

Alternatives to SciPy

If you want to understand more about how lookup tables work, or can’t/won’t install SciPy, check out this interpolated lookup table class. I wouldn’t use it for serious math, since it doesn’t use Numpy and is probably not very fast.

2 thoughts on “A lookup table for fast Python math”

  1. Pingback: Optimizing Python code for fast math | as through a mirror dimly

  2. Pingback: Lookup tables and spline fitting in Python | as through a mirror dimly

Leave a Comment

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.