My research group uses Labview 7.1 to write custom data acquisition (DAQ) software. I code everything else in Python, so I need to get data from Labview into Python for processing. Our DAQ program produces Labview binary files, so I had to find a way to read them with Python. Binary files are nice because they store numerical data much more compactly than ASCII or (heaven forbid) XML, but they are much harder to read. The binary format used by Labview is documented only indirectly, so I had to hack a little.
The first thing to realize is that a Labview binary file is a direct dump of the data as it was stored in RAM. How Labview stores data in memory is documented here; indirectly, that also documents how binary files are laid out on disk. Our DAQ program writes a rather complex “cluster” (Labview’s version of a C structure) to disk. The elements of the cluster are stored contiguously as a sequence of bytes, and there is no way to know which byte goes with which element unless you know the size of each element and the order in which they appear in the cluster. So the first step is to document the cluster that is being written to disk. In Labview, use the context help to view the data type of the wire leading into the VI that writes the file. With this in hand, you are ready to write Python code.
First, make sure you open the file in binary mode:
binaryFile = open("Measurement_4.bin", mode='rb')
Now you can use the file.read(n) method to read n bytes of data from the file. The data comes back as a byte string, so you will need the struct module to interpret it as packed binary data. This is what it looks like:
(data.offset,) = struct.unpack('>d', binaryFile.read(8))
The first argument to struct.unpack is a format string. The ‘>’ tells struct to interpret the data as big-endian (the default for Labview) and the ‘d’ tells it to produce a double-precision float. The format has to match the number of bytes read in by file.read(). Remember that we’re reading the file sequentially and there are no delimiters, so if you read the wrong number of bytes for any field, all the subsequent data will be junk.
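To make the sequence concrete, here is a minimal sketch of reading a hypothetical cluster field by field from the file opened above. The field names, types, and their order are invented for illustration; substitute the layout you documented with the context help.

import struct

# Hypothetical cluster: a double, a single, and a 32-bit integer,
# read in the order they appear in the cluster (no padding, no delimiters).
(offset,) = struct.unpack('>d', binaryFile.read(8))        # 8-byte big-endian double
(gain,) = struct.unpack('>f', binaryFile.read(4))          # 4-byte big-endian single
(numChannels,) = struct.unpack('>i', binaryFile.read(4))   # 4-byte big-endian integer

# Equivalently, a fixed-size block of fields can be read in one call:
# offset, gain, numChannels = struct.unpack('>dfi', binaryFile.read(16))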
One final note about arrays: an array is represented by a 32-bit integer giving the length of each dimension, followed by the data. If the array contains no elements, it is stored as 32 zero bits with no other data. It would probably be a good idea to use the Python array module if you need to read in large arrays efficiently.
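As a rough illustration, assuming a one-dimensional array of big-endian doubles and continuing from the same open file, the length prefix can be read first and the elements pulled into an array.array for efficiency:

import struct
import sys
from array import array

# Read the 32-bit length prefix, then the elements (big-endian doubles here).
(count,) = struct.unpack('>i', binaryFile.read(4))
values = array('d')                          # compact storage for large arrays
if count > 0:
    values.frombytes(binaryFile.read(8 * count))
    if sys.byteorder == 'little':
        values.byteswap()                    # the file is big-endian; array stores native order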
Thanks for sharing your experience. If you are dealing with LabView data, you should really be using NumPy, which is the standard N-dimensional array module for Python.
With NumPy you can use its excellent facilities for reading binary data into efficient memory structures that can be easily accessed using Python syntax:
Something like
data = numpy.fromfile("Measurement_4.bin", dtype=">d")
will give you the data in a NumPy array. This array can be plotted or visualized using any number of tools (Chaco, matplotlib, etc.).
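If the cluster mixes types but has a fixed layout (no variable-length arrays), a structured dtype is one way to read it with NumPy. The field names and shapes below are placeholders, not the actual cluster from the post:

import numpy as np

# Hypothetical fixed-layout record: one big-endian double followed by
# a block of 100 big-endian doubles.  Replace with your real cluster layout.
cluster = np.dtype([("offset", ">f8"), ("samples", ">f8", (100,))])
records = np.fromfile("Measurement_4.bin", dtype=cluster)
print(records["offset"], records["samples"].shape)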
Try out a distribution of Python like the Enthought Python Distribution to get the complete assortment of Python tools that let you interact easily with binary data no matter what the source.
You may also be interested in using memory-mapped files to access portions of a file very quickly. See the corresponding public webinar on this page for details.
http://www.enthought.com/training/SCPwebinar.php
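For example, a minimal numpy.memmap sketch (assuming the file is a flat block of big-endian doubles; the slice indices are placeholders) lets you work with a window of the file without reading all of it:

import numpy as np

# Map the file read-only; only the sliced elements are actually paged in.
mapped = np.memmap("Measurement_4.bin", dtype=">d", mode="r")
window = mapped[1000:2000]
print(window.mean())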
Hey Travis–thanks for reading, and thanks for Numpy! I will have to look into your suggestions and write an updated post, since this page seems to be quite popular. We are no longer using Labview binaries for our data storage, since we have figured out how to save the data in HDF format. The HDF files tend to be somewhat larger, but disk space is cheap and the self-documenting organization of the HDF files is very useful.
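If you go the same route, something like this h5py sketch reads the data back; the file and dataset names here are placeholders for whatever your DAQ program writes:

import h5py

# Open the HDF5 file read-only and pull one dataset into memory.
with h5py.File("Measurement_4.h5", "r") as f:
    data = f["measurement"][:]             # read the dataset into memory
    print(dict(f["measurement"].attrs))    # self-documenting metadata stored with the data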