Supplement to "Direct maximization of protein identifications
from tandem mass spectra"
Marina Spivak, Jason Weston, Michael J. MacCoss and William
Stafford Noble.
Molecular and Cellular Proteomics. 11(2):M111.012161,
2012.
Mass spectrometry data
Yeast spectra are available here. The
data sets "Yeast 1," "Yeast 2" and "Yeast 3" were used in this
work.
The original worm spectra are not available, but the search results
are stored in SQT format
here: full
set, randomly selected
subset. The corresponding C. elegans database
is here. In
addition to worm proteins, the database contains 4311 E. coli
proteins and 44 common contaminants. Each protein occurs in the
database twice, once in the forward and once in the reverse
orientation. The reversed proteins have IDs of the form
"random_seq_1".
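As a minimal sketch of how the forward and reversed entries could be separated when reading such a database, the following assumes the database is a FASTA file and that every reversed protein ID begins with the prefix "random_seq_", which is generalized from the single example ID above. The function names are hypothetical and are not part of Crux or of the original analysis.

```python
from typing import Iterable, Iterator, List, Tuple

# Assumption: all reversed (decoy) IDs share this prefix, inferred from "random_seq_1".
DECOY_PREFIX = "random_seq_"

Record = Tuple[str, str]


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (protein_id, sequence) pairs from FASTA-formatted lines."""
    protein_id = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if protein_id is not None:
                yield protein_id, "".join(chunks)
            # Take the first whitespace-delimited token of the header as the ID.
            header = line[1:].split()
            protein_id = header[0] if header else ""
            chunks = []
        else:
            chunks.append(line)
    if protein_id is not None:
        yield protein_id, "".join(chunks)


def split_targets_and_decoys(
    records: Iterable[Record],
) -> Tuple[List[Record], List[Record]]:
    """Partition records into forward (target) and reversed (decoy) proteins."""
    targets: List[Record] = []
    decoys: List[Record] = []
    for protein_id, sequence in records:
        if protein_id.startswith(DECOY_PREFIX):
            decoys.append((protein_id, sequence))
        else:
            targets.append((protein_id, sequence))
    return targets, decoys
```

To use it on a file, pass an open file handle to `read_fasta`; because each protein occurs once in each orientation, the two returned lists would be expected to have equal length.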
Human spectra are available here. Each set of
spectra is available in both MS2 and compressed MS2 format. Crux
search-for-matches output is stored in SQT format. The human
reference database used for the searches
is here. In addition to human
proteins, the database contains 47 common contaminants. Each protein
occurs in forward and reverse form.
External gold standard
mRNA transcriptome data from
F. C. P. Holstege, E. G. Jennings, J. J. Wyrick, T. I. Lee, C. J.
Hengartner, M. R. Green, T. R. Golub, E. S. Lander, and
R. A. Young. "Dissecting the regulatory circuitry of a eukaryotic
genome." Cell, 95:717–728, 1998.
Protein tagging data from S. Ghaemmaghami,
W. K. Huh, K. Bower, R. W. Howson, A. Belle, N. Dephoure,
E. K. O'Shea, and J. S. Weissman. "Global analysis of protein
expression in yeast." Nature, 425:737–741, 2003.