Benchmarking BLAS Part 1: Building BLASbench with gcc

I recently installed ATLAS on my new workstation and I’ve been wondering what kind of performance gain I would get compared to the reference BLAS. After some searching, I found a BLAS benchmarking tool called BLASbench, which is part of the LLCbench suite. Its build process is unusual, so follow the instructions on the LLCbench web page. In short, you have to create a file called sys.def that contains the compiler and linker settings for your system.
Here’s the sys.def file I used for my Gentoo system:

# Linux-mpich sys.def
# Blasbench values
BB_CC = gcc
BB_F77 = gfortran
BB_LD = gcc
BB_CFLAGS = -O3 -Wall -DREGISTER -DINLINE
BB_LDFLAGS = $(BB_CFLAGS)
BB_LIBS = -lblas -lrt
# Cachebench values
CB_CC = $(BB_CC)
CB_CFLAGS = -O -Wall
CB_LDFLAGS = $(CB_CFLAGS)
CB_LIBS = -lrt
# MPbench values
MP_MPI_CC = mpicc
MP_CFLAGS = $(BB_CFLAGS)
MP_LIBS = -lrt
MPIRUNCMD = mpirun
MPIRUNPROCS = -np
MPIRUNPOSTOPTS = mpi_bench

Note that I haven’t tried the cache or MPI benchmarks, so the Cachebench and MPbench values may be wrong. Initially, the build kept failing with this link error:

gcc -O3 -Wall -DREGISTER -DINLINE -o vblasbench bb.o flushall.o timer.o  -lblas -lrt
bb.o: In function `MAIN__':
bb.c:(.text+0xe): undefined reference to `s_stop'
collect2: ld returned 1 exit status
make[1]: *** [vblasbench] Error 1

Most of the Google hits for “undefined reference to `s_stop'” talked about linking issues with g77 vs gfortran. Since this is a brand new installation that has never had g77, a mixed g77/gfortran setup couldn’t be my issue. I solved the problem by deleting lines 109-127 from the file blasbench/bb.c. The deleted lines are shown below:

#if defined(ia64)
#else
/* Entry points to fool fortran linkers...*/
#ifdef __linux__
int MAIN__()
#endif
#if defined(__hppa) || defined(_HPUX_SOURCE)
int __main()
#endif
#if defined(__linux__) || defined(__hppa)
{
#if defined(__linux__) && defined(__GNUC__)
/* Subroutine */ int s_stop();
s_stop("", 0L);
#endif
return(0);
}
#endif
#endif

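This looks like a hack that was added to work around issues with g77, but now it creates problems for modern compilers and linkers: s_stop is a routine from the g77/f2c runtime library, which is not installed on a gfortran-only system, so the linker can’t resolve the reference. After deleting that block of code, I was able to build BLASbench successfully.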
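A less invasive alternative to deleting the block would be to keep it but compile it only on request. What follows is a minimal sketch of that idea, not code from the LLCbench sources: the macro BB_NEED_G77_ENTRY and the helper bb_has_g77_entry are made-up names for illustration, and the s_stop prototype simply mirrors the call in the original block.

```c
/* Sketch: compile the g77 entry-point workaround only when asked to.
 * BB_NEED_G77_ENTRY is a hypothetical opt-in macro; nothing in LLCbench
 * defines it. */
#if defined(BB_NEED_G77_ENTRY) && defined(__linux__) && defined(__GNUC__)

/* Provided by the g77/f2c runtime; prototype assumed from the original call. */
int s_stop(char *msg, long len);

/* Dummy Fortran main program symbol, as in the original block. */
int MAIN__(void)
{
    s_stop("", 0L);
    return 0;
}

#define BB_HAS_G77_ENTRY 1
#else
#define BB_HAS_G77_ENTRY 0
#endif

/* Hypothetical helper: reports whether the workaround was compiled in. */
int bb_has_g77_entry(void)
{
    return BB_HAS_G77_ENTRY;
}
```

With this arrangement, a default build never references s_stop. Someone who still links against the g77 runtime could add -DBB_NEED_G77_ENTRY to BB_CFLAGS in sys.def to get the old behavior back.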
In Part 2, I’ll explain how to run BLASbench and compare the results I got for the reference BLAS implementation, BLAS-ATLAS and BLAS-ATLAS with threads.
