Journal of Computational Biology

Probabilistic Disease Classification of Expression-Dependent Proteomic Data from Mass Spectrometry of Human Serum

To cite this article:
Ryan H. Lilien, Hany Farid, and Bruce R. Donald. Journal of Computational Biology. December 2003, 10(6): 925-946. doi:10.1089/106652703322756159.

Published in Volume: 10 Issue 6: July 5, 2004

Author information

Ryan H. Lilien
Dartmouth Computer Science Department, Hanover, NH 03755; and Dartmouth Medical School, Hanover, NH 03755
Hany Farid
Dartmouth Computer Science Department, Hanover, NH 03755
Bruce R. Donald
Dartmouth Computer Science Department, Hanover, NH 03755; Dartmouth Chemistry Department, Hanover, NH 03755; and Dartmouth Department of Biological Sciences, Hanover, NH 03755

ABSTRACT

We have developed an algorithm called Q5 for probabilistic classification of healthy versus disease whole serum samples using mass spectrometry. The algorithm employs principal components analysis (PCA) followed by linear discriminant analysis (LDA) on whole spectrum surface-enhanced laser desorption/ionization time of flight (SELDI-TOF) mass spectrometry (MS) data and is demonstrated on four real datasets from complete, complex SELDI spectra of human blood serum. Q5 is a closed-form, exact solution to the problem of classification of complete mass spectra of a complex protein mixture. Q5 employs a probabilistic classification algorithm built upon a dimension-reduced linear discriminant analysis. Our solution is computationally efficient; it is noniterative and computes the optimal linear discriminant using closed-form equations. The optimal discriminant is computed and verified for datasets of complete, complex SELDI spectra of human blood serum. Replicate experiments of different training/testing splits of each dataset are employed to verify robustness of the algorithm. The probabilistic classification method achieves excellent performance. We achieve sensitivity, specificity, and positive predictive values above 97% on three ovarian cancer datasets and one prostate cancer dataset. The Q5 method outperforms previous full-spectrum complex sample spectral classification techniques and can provide clues as to the molecular identities of differentially expressed proteins and peptides.
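The pipeline named in the abstract (PCA for dimension reduction, then a closed-form linear discriminant, then a probabilistic class assignment) can be illustrated with a minimal sketch. This is not the authors' Q5 implementation: the synthetic "spectra", the number of retained components, the fixed train/test split, and the choice of equal-prior Gaussian class models on the discriminant axis are all assumptions made only for illustration, and names such as fit_pca_lda and predict_proba are hypothetical.

```python
import numpy as np

def fit_pca_lda(X, y, n_components):
    """Fit PCA followed by two-class Fisher LDA. X: (samples, features); y: 0/1 labels."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA via SVD of the centered data; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components].T                 # (features, n_components)
    Z = Xc @ W                              # training data in the reduced space
    Z0, Z1 = Z[y == 0], Z[y == 1]
    m0, m1 = Z0.mean(axis=0), Z1.mean(axis=0)
    # Within-class scatter matrix in the reduced space.
    Sw = (Z0 - m0).T @ (Z0 - m0) + (Z1 - m1).T @ (Z1 - m1)
    # Closed-form Fisher discriminant: w is proportional to Sw^{-1} (m1 - m0).
    w = np.linalg.solve(Sw, m1 - m0)
    p0, p1 = Z0 @ w, Z1 @ w
    # Assumption: model each class as a 1-D Gaussian along the discriminant axis.
    return {"mean": mean, "W": W, "w": w,
            "class0": (p0.mean(), p0.std(ddof=1)),
            "class1": (p1.mean(), p1.std(ddof=1))}

def predict_proba(model, X):
    """Posterior probability of class 1 (disease), assuming equal class priors."""
    p = (X - model["mean"]) @ model["W"] @ model["w"]

    def log_gauss(mu, sd):
        # Gaussian log-density up to a constant shared by both classes.
        return -0.5 * ((p - mu) / sd) ** 2 - np.log(sd)

    l0 = log_gauss(*model["class0"])
    l1 = log_gauss(*model["class1"])
    return np.exp(l1 - np.logaddexp(l0, l1))

# Synthetic stand-in for spectra: 40 samples per class, 500 intensity bins,
# with 20 bins shifted upward in the "disease" class (purely illustrative).
rng = np.random.default_rng(0)
healthy = rng.normal(size=(40, 500))
disease = rng.normal(size=(40, 500))
disease[:, 100:120] += 3.0
X = np.vstack([healthy, disease])
y = np.repeat([0, 1], 40)

# Fixed split: 30 training and 10 test samples per class.
train = np.r_[0:30, 40:70]
test = np.r_[30:40, 70:80]

model = fit_pca_lda(X[train], y[train], n_components=10)
probs = predict_proba(model, X[test])
pred = probs >= 0.5
truth = y[test] == 1

sensitivity = np.sum(pred & truth) / np.sum(truth)
specificity = np.sum(~pred & ~truth) / np.sum(~truth)
accuracy = np.mean(pred == truth)
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} accuracy={accuracy:.2f}")
```

Both the projection and the discriminant here come from direct linear algebra (one SVD and one linear solve), with no iterative optimization. Repeating the fit over several different training/testing splits, as the abstract describes, would give a rough picture of how stable the reported rates are.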

This paper was cited by:

Importancia de la implantación de la proteómica a nivel asistencial
Ángel San Miguel Hernández
Revista del Laboratorio Clínico. Sep 2011
Dimensionality reduction and main component extraction of mass spectrometry cancer data
Yihui Liu
Knowledge-Based Systems. Aug 2011
On the distance concentration awareness of certain data reduction techniques
Ata Kabán
Pattern Recognition. Feb 2011, Vol. 44, No. 2: 265-277
Peek a peak: a glance at statistics for quantitative label-free proteomics
Katharina Podwojski, Martin Eisenacher, Michael Kohl, Michael Turewicz, Helmut E Meyer, Jörg Rahnenführer, Christian Stephan
Expert Review of Proteomics. Apr 2010, Vol. 7, No. 2: 249-261
An Extended Markov Blanket Approach to Proteomic Biomarker Detection From High-Resolution Mass Spectrometry Data
Jung Hun Oh, P. Gurnani, J. Schorge, K.P. Rosenblatt, J.X. Gao
IEEE Transactions on Information Technology in Biomedicine. Mar 2009, Vol. 13, No. 2: 195-206
Modeling Exopeptidase Activity from LC-MS Data
Bogusław Kluge, Anna Gambin, Wojciech Niemiro
Journal of Computational Biology. Feb 2009, Vol. 16, No. 2: 395-406
QSAR and complex network study of the chiral HMGR inhibitor structural diversity
I. Garcia, C. Munteanu, Y. Fall, G. Gomez, E. Uriarte, H. González-Díaz
Bioorganic & Medicinal Chemistry. Jan 2009, Vol. 17, No. 1: 165-175
Quantitative Proteome–Property Relationships (QPPRs). Part 1: Finding biomarkers of organic drugs with mean Markov connectivity indices of spiral networks of blood mass spectra
Maykel Cruz-Monteagudo, Cristian Robert Munteanu, Fernanda Borges, M. Natália D.S. Cordeiro, Eugenio Uriarte, Humberto González-Díaz
Bioorganic & Medicinal Chemistry. Nov 2008, Vol. 16, No. 22: 9684-9693
Computational Prediction Models for Early Detection of Risk of Cardiovascular Events Using Mass Spectrometry Data
T.D. Pham, Honghui Wang, Xiaobo Zhou, Dominik Beck, M. Brandl, G. Hoehn, J. Azok, M.-L. Brennan, S.L. Hazen, K. Li, S.T.C. Wong
IEEE Transactions on Information Technology in Biomedicine. Sep 2008, Vol. 12, No. 5: 636-643
HP-Lattice QSAR for dynein proteins: Experimental proteomics (2D-electrophoresis, mass spectrometry) and theoretic study of a Leishmania infantum sequence
María Auxiliadora Dea-Ayuela, Yunierkis Pérez-Castillo, Alfredo Meneses-Marcel, Florencio M. Ubeira, Francisco Bolas-Fernández, Kuo-Chen Chou, Humberto González-Díaz
Bioorganic & Medicinal Chemistry. Aug 2008, Vol. 16, No. 16: 7770-7776
Statistical data processing in clinical proteomics
S. Smit, H. Hoefsloot, A. Smilde
Journal of Chromatography B. Apr 2008, Vol. 866, No. 1-2: 77-88
Proteomics, networks and connectivity indices
Humberto González-Díaz, Yenny González-Díaz, Lourdes Santana, Florencio M. Ubeira, Eugenio Uriarte
PROTEOMICS. Feb 2008, Vol. 8, No. 4: 750-778
How to distinguish healthy from diseased? Classification strategy for mass spectrometry-based clinical proteomics
Margriet M. W. B. Hendriks, Suzanne Smit, Wies L. M. W. Akkermans, Theo H. Reijmers, Paul H. C. Eilers, Huub C. J. Hoefsloot, Carina M. Rubingh, Chris G. de Koster, Johannes M. Aerts, Age K. Smilde
PROTEOMICS. Oct 2007, Vol. 7, No. 20: 3672-3680
Proteomics in clinical prostate research
Magnus Hellström, Sara Jonmarker, Janne Lehtiö, Gert Auer, Lars Egevad
PROTEOMICS – CLINICAL APPLICATIONS. Sep 2007, Vol. 1, No. 9: 1058-1065
Aplicaciones de las técnicas proteómicas en medicina asistencial: situación actual y perspectivas
J. Caballero Villarraso
Revista Clínica Española. Jul 2007, Vol. 207, No. 7: 344-347
A multivariate analysis approach to the integration of proteomic and gene expression data
Ailís Fagan, Aedín C. Culhane, Desmond G. Higgins
PROTEOMICS. Jun 2007, Vol. 7, No. 13: 2162-2171
Statistics for Proteomics: A Review of Tools for Analyzing Experimental Data
Wolfgang Urfer, Marco Grzegorczyk, Klaus Jung
PROTEOMICS. Sep 2006, Vol. 6, No. S2: 48-55
Processing and classification of protein mass spectra
Melanie Hilario, Alexandros Kalousis, Christian Pellegrini, Markus Müller
Mass Spectrometry Reviews. May 2006, Vol. 25, No. 3: 409-449
A machine learning perspective on the development of clinical decision support systems utilizing mass spectra of blood samples
H. Shin, M. Markey
Journal of Biomedical Informatics. Apr 2006, Vol. 39, No. 2: 227-248
Biomarker discovery by proteomics: challenges not only for the analytical chemist
Peter Horvatovich, Natalia Govorukhina, Rainer Bischoff
The Analyst. Jan 2006, Vol. 131, No. 11: 1193
Bioinformatics approaches in clinical proteomics
Eric T Fung, Scot R Weinberger, Ed Gavin, Fujun Zhang
Expert Review of Proteomics. Dec 2005, Vol. 2, No. 6: 847-862
Classification ensembles for unbalanced class sizes in predictive toxicology
J. J. Chen, C. A. Tsai, J. F. Young, R. L. Kodell
SAR and QSAR in Environmental Research. Dec 2005, Vol. 16, No. 6: 517-529
Finding Cancer Biomarkers from Mass Spectrometry Data by Decision Lists
Jian Liu, Ming Li
Journal of Computational Biology. Sep 2005, Vol. 12, No. 7: 971-979
Improving feature detection and analysis of surface-enhanced laser desorption/ionization-time of flight mass spectra
Scott M. Carlson, Amir Najmi, John C. Whitin, Harvey J. Cohen
PROTEOMICS. Jul 2005, Vol. 5, No. 11: 2778-2788
Proteomics and the Analysis of Proteomic Data: An Overview of Current Protein-Profiling Technologies
Erol E. Gulcicek, Christopher M. Colangelo, Walter McMurray, Kathryn Stone, Kenneth Williams, Terence Wu, Hongyu Zhao, Heidi Spratt, Alexander Kurosky, Baolin Wu
. Jul 2005
Ovarian cancer identification based on dimensionality reduction for high-throughput mass spectrometry data
J. S. Yu, S. Ongarello, R. Fiedler, X. W. Chen, G. Toffolo, C. Cobelli, Z. Trajanoski
Bioinformatics. May 2005, Vol. 21, No. 10: 2200-2209
