How to cite this WIREs title:
WIREs Data Mining Knowl Discov
Impact Factor: 2.541

Mining proteomic data for biomedical research

Abstract

The popularity of proteomics in biomedical research has grown with the development of advanced measurement technologies. This has enabled high‐throughput protein expression profiling, modification‐specific proteomics, and global protein–protein interaction maps. Although proteomics has great potential in providing deeper understanding of the role of individual proteins and protein networks in disease and in unveiling the underlying disease mechanisms, challenges arise in transforming the large‐scale experimental data into biomedical knowledge for clinical practice and drug development. In particular, sophisticated computational tools are required to interpret the high‐dimensional proteomic datasets that typically reflect not only biological information, but also technical biases and limitations. This review gives an overview of the role of data mining in biomedical applications of proteomics, with a focus on data from mass spectrometry‐based expression profiling studies.

© 2011 Wiley Periodicals, Inc.

This article is categorized under: Algorithmic Development > Biological Data Mining

Application of biomarkers at different stages of disease development. Various genetic and environmental factors can contribute to disease, and the disease may develop long before the onset of clinical symptoms. Biomarkers would be useful to predict disease risk and to diagnose the disease earlier, as well as to follow the disease progression and the efficacy of medical treatment.

Visualization of the protein–protein interactions (PPIs) for the proteins identified as differentially expressed between female and male patients with severe burns, using the reproducibility‐optimized test statistic (ROTS; see Ref 52 for details of the differential expression analysis). The network was generated with the Protein Interaction Network Analysis (PINA) tool, using the 11 identified differentially expressed proteins with UniProt/SwissProt accessions (green nodes) as query proteins. Yellow nodes indicate proteins that interact with at least two of the query proteins. Lines indicate interactions.
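
The selection rule behind the yellow nodes (non-query proteins that interact with at least two query proteins) can be illustrated with a minimal Python sketch. It does not call PINA or any interaction database; the function name `shared_interactors`, the edge list, and the protein identifiers are hypothetical placeholders chosen for illustration only.

```python
from collections import defaultdict


def shared_interactors(interactions, query_proteins, min_queries=2):
    """Return non-query proteins that interact with >= min_queries query proteins.

    interactions: iterable of (protein_a, protein_b) pairs, treated as undirected.
    query_proteins: iterable of query protein identifiers.
    """
    query = set(query_proteins)
    partners = defaultdict(set)  # non-query protein -> query proteins it touches
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        if a in query and b not in query:
            partners[b].add(a)
        if b in query and a not in query:
            partners[a].add(b)
    return {
        protein: sorted(queries)
        for protein, queries in partners.items()
        if len(queries) >= min_queries
    }


# Hypothetical toy network; Q* are query proteins, the rest are neighbours.
toy_interactions = [
    ("Q1", "X"), ("X", "Q2"),   # X touches two query proteins
    ("Q1", "Y"), ("Y", "W"),    # Y touches only one query protein
    ("Q2", "Q3"),               # query-query edge, no new node
    ("Z", "Q3"), ("Z", "Q1"),   # Z touches two query proteins
]
shared = shared_interactors(toy_interactions, ["Q1", "Q2", "Q3"])
print(shared)
```

In this toy network only `X` and `Z` satisfy the rule; `Y` is dropped because it reaches a single query protein.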

A comparison between the result of a statistical test and reality. Two types of errors may occur: a protein that is erroneously declared as differentially expressed (false positive), or a truly differentially expressed protein that is not detected (false negative). Instances in which the test result matches the reality are called true positive or true negative.
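
The four outcomes can be tallied directly once the true status of each protein is known, as in a simulation or a spike-in experiment. The following Python sketch is only an illustration: the function name `classify_calls` and the protein labels are hypothetical, and the false discovery proportion computed at the end is one common summary, not something the caption itself defines.

```python
from collections import Counter


def classify_calls(called, truly_changed, all_proteins):
    """Count true/false positives and negatives for a set of test calls."""
    called = set(called)
    truly_changed = set(truly_changed)
    counts = Counter()
    for protein in all_proteins:
        if protein in called:
            # Declared differentially expressed by the test.
            counts["TP" if protein in truly_changed else "FP"] += 1
        else:
            # Not declared differentially expressed by the test.
            counts["FN" if protein in truly_changed else "TN"] += 1
    return counts


# Hypothetical toy data: six proteins, three of them truly changed.
all_proteins = ["P1", "P2", "P3", "P4", "P5", "P6"]
truly_changed = ["P1", "P2", "P3"]
called = ["P1", "P2", "P4"]

counts = classify_calls(called, truly_changed, all_proteins)
# Share of the declared proteins that are false positives.
false_discovery_proportion = counts["FP"] / max(counts["TP"] + counts["FP"], 1)
print(dict(counts), false_discovery_proportion)
```

Here `P4` is a false positive and `P3` is a false negative, so one of the three declared proteins is a false discovery.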

Effect of the characteristics of the data on the performance of different protein ranking statistics. In the upper figure, data were generated from a normal model, where both differentially and equally expressed proteins had similar variances (model M1 in Ref 53), whereas in the lower figure, differentially expressed proteins had higher variances (model M2 in Ref 53). In the former case, the ordinary t‐test performed better than the fold change, whereas in the latter case, the result was the opposite. In both cases, the reproducibility‐optimized test statistic (ROTS) showed the best performance.
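
A comparison of this kind can be explored with a small simulation. The Python sketch below is an illustration under stated assumptions, not a reproduction of models M1 and M2 from Ref 53, and it does not implement ROTS. All parameter names and values (`shift`, `sd_spread`, `sd_changed_factor`, the number of replicates) are hypothetical, the t-statistic is the Welch variant, and the fold change is taken as the absolute difference of group means on a log scale. Which statistic ranks better depends on these choices.

```python
import random
import statistics


def welch_t(a, b):
    """Absolute Welch t-statistic for two groups of replicate measurements."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return abs(statistics.mean(a) - statistics.mean(b)) / se


def fold_change(a, b):
    """Absolute difference of group means (fold change on a log scale)."""
    return abs(statistics.mean(a) - statistics.mean(b))


def simulate(n_proteins=500, n_changed=50, n_rep=4, shift=1.0,
             sd_changed_factor=1.0, sd_spread=0.5, seed=0):
    """Simulate normal data; the first n_changed proteins are truly changed."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_proteins):
        changed = i < n_changed
        sd = rng.lognormvariate(0.0, sd_spread)  # protein-specific noise level
        if changed:
            sd *= sd_changed_factor  # optionally inflate variance of changed proteins
        group_a = [rng.gauss(0.0, sd) for _ in range(n_rep)]
        group_b = [rng.gauss(shift if changed else 0.0, sd) for _ in range(n_rep)]
        rows.append((changed, group_a, group_b))
    return rows


def true_positives_in_top(rows, score, k):
    """Number of truly changed proteins among the k highest-scoring ones."""
    ranked = sorted(rows, key=lambda row: score(row[1], row[2]), reverse=True)
    return sum(1 for changed, _, _ in ranked[:k] if changed)


# Two hypothetical settings: similar variances vs. inflated variance in changed proteins.
for label, factor in [("similar variances", 1.0), ("higher variance if changed", 3.0)]:
    rows = simulate(sd_changed_factor=factor)
    print(label,
          "t-test:", true_positives_in_top(rows, welch_t, 50),
          "fold change:", true_positives_in_top(rows, fold_change, 50))
```

Rerunning with different seeds and parameter values gives a sense of how sensitive each ranking statistic is to the variance structure of the data.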

A generic workflow of a clinical protein profiling study. Careful experimental design is a prerequisite for a successful study. After preprocessing and quality control of the raw experimental data generated from the clinical samples, various data mining techniques are applied to identify a set of candidate biomarkers for further validation. Finally, an optimal set of validated biomarkers is determined for the construction of a clinical assay.

Related Articles

Drug Discovery: An Interdisciplinary View
Proteomics: An Interdisciplinary View
