Reading MS Data Formats

This page covers mass spectrometry datafile formats, in particular those used by old (and perhaps no longer existent) mass spectrometers. On occasion one might need to read information from an old file. Without access to the original software and hardware capable of running it, or at least a description of the format used to store the data, that may not be possible. The result can be long-term loss of data through obsolescence, a growing concern that has been termed "data rot". The problem is especially relevant for small molecule analyses, since the current focus on data exchange through XML formats is mainly concerned with proteomics, with less emphasis on small molecule files.

Several commercial and free programs can read and interconvert current data formats, and some of them are listed in the Other Sources links on this page. However, some of these programs read raw data files only for internal use within the program; they may not provide an accessible output of the original raw, unformatted data. AMDIS, for example, can output the TIC from old HP MSD files, but apparently not the individual spectral data.

Several approaches to this problem of obsolete file structures are being pursued. One example is OpenChrom, which currently offers both open source and paid (Enterprise) versions supporting many file formats. A link to OpenChrom is provided in the adjacent list. The OpenChrom project appears to be developing into an independent processing package, with extensive data analysis capabilities beyond simply reading and writing file formats. User-developed plugins should continue to expand file support options in the future. OpenChrom offers some excellent capabilities and is worth considering for anyone interested in converting MS datafiles, although support for older formats is typically lacking.

Another approach to the problem of data obsolescence is to document the file structures of obsolete systems where they are available. This site currently provides code listings in Pascal for reading two formats: the original Hewlett-Packard ChemStation MSD .D file, designed to run on the early HP Motorola processor-based systems, and an old Sciex API III .zipd format used for quantitation with Sciex MacQuan on the Macintosh.
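One practical detail when reading files written on older Motorola processor-based systems is byte order: such files generally store multi-byte integers big-endian, whereas most current desktop processors are little-endian. The following Python sketch illustrates the idea only. The record layout, field names, and both function names are hypothetical assumptions made for illustration; they are not the actual HP MSD or Sciex structures, which are documented in the Pascal listings.

```python
import struct

# HYPOTHETICAL record layout, for illustration only.
# This is NOT the real HP MSD .D or Sciex format:
#   uint16  n_peaks
#   uint32  retention_time_ms
#   n_peaks x (uint16 mass, uint16 abundance)
# The ">" prefix selects big-endian byte order with no padding.
HEADER = struct.Struct(">HI")
PEAK = struct.Struct(">HH")


def parse_record(buf, offset=0):
    """Decode one hypothetical big-endian spectrum record from buf."""
    n_peaks, rt_ms = HEADER.unpack_from(buf, offset)
    offset += HEADER.size
    peaks = []
    for _ in range(n_peaks):
        mass, abundance = PEAK.unpack_from(buf, offset)
        peaks.append((mass, abundance))
        offset += PEAK.size
    # Return the new offset so the caller can read the next record.
    return rt_ms, peaks, offset


def build_record(rt_ms, peaks):
    """Encode a record in the same hypothetical layout (used for testing)."""
    out = HEADER.pack(len(peaks), rt_ms)
    for mass, abundance in peaks:
        out += PEAK.pack(mass, abundance)
    return out


if __name__ == "__main__":
    record = build_record(1500, [(50, 1000), (77, 65535)])
    rt_ms, peaks, _ = parse_record(record)
    print(rt_ms, peaks)
    # The same two bytes decode differently depending on byte order.
    print(struct.unpack(">H", b"\x01\x00")[0], struct.unpack("<H", b"\x01\x00")[0])
```

Reading the same bytes with the wrong byte order yields plausible-looking but incorrect numbers, so it is worth checking a decoded value (for example a scan count or mass range) against a known result before trusting a converted file.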

In addition, a simple program that reads and outputs HP MSD data, using the MSD data access code described above, is included on the Free Software page. The HP MSD has enjoyed wide distribution, and many programs are available that can read HP MSD files, some of which are included in the links. We also include a free program that can read and output data from Agilent MassHunter GC/MS and MS/MS files, although those files are current and supported by Agilent. These programs are available on the Freeware page.

We would welcome additional contributions or suggested links from anyone who knows of routines, written in any language, for accessing old and mainly obsolete raw MS datafiles. Access to old MS file information may seldom be needed, but when it is, there are likely to be few options available.