How to cite this WIREs title:
WIREs Data Mining Knowl Discov
Impact Factor: 2.541

Mining from protein–protein interactions

Abstract

Proteins are important cellular molecules, and interacting protein pairs provide biologically important information, such as functional relationships. We focus on the problem of predicting physically interacting protein pairs. This is an important problem in biology that has been actively investigated in the field of data mining and knowledge discovery. Our particular focus is on data‐mining‐based methods, and the objective of this review is to introduce these methods to data mining researchers from a technical viewpoint. We categorize the methods into three types: pairwise data‐based, network‐based, and integrative approaches, with each approach described in its own section. The first section is further divided into five types, including supervised learning, algorithmic approaches, and unsupervised learning. The second section is mainly on link prediction, which can be further divided into two types, and it is followed by two additional subsections covering topics related to protein interaction networks. The final section presents a wide variety of methods used in integrative approaches. © 2012 Wiley Periodicals, Inc.

This article is categorized under:
Algorithmic Development > Biological Data Mining
Application Areas > Industry Specific Applications
Technologies > Machine Learning

Physical interactions and physical domain–domain interactions. The two circles at the top are proteins. Proteins are amino acid sequences in which domains (features) are embedded, as shown at the bottom. Two proteins physically interact with each other when domains of these two proteins interact with each other.

Three types of approaches for predicting physical interactions: (a) pairwise data‐based, (b) network‐based, and (c) integrative. Nodes and edges represent proteins and interactions, respectively. In (a), interacting protein pairs are given together with the features of the proteins (nodes), and new interactions are predicted from these two types of input. In (b), interactions are given as a network, and new interactions are predicted by link prediction, which takes the network topology into account more than (a) does. In (c), multiple networks are given, and new interactions are predicted using all of them.
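
To make the link prediction setting in (b) concrete, the following is a minimal sketch that scores non-adjacent protein pairs by how many interaction partners they share. The common-neighbors score is a standard topology-based baseline chosen here only for illustration; it is not necessarily one of the methods covered by the review, and the function name and protein labels are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def common_neighbor_scores(edges):
    """Score each non-adjacent node pair by its number of shared neighbors.

    `edges` is an iterable of (protein, protein) pairs describing known
    interactions in an undirected network. This is an illustrative
    baseline, not a method taken from the article.
    """
    # Build an adjacency map for the undirected interaction network.
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already a known interaction, nothing to predict
        shared = len(neighbors[u] & neighbors[v])
        if shared:
            scores[(u, v)] = shared
    return scores


# Toy network: a four-node cycle P1 - P2 - P4 - P3 - P1.
known = [("P1", "P2"), ("P1", "P3"), ("P2", "P4"), ("P3", "P4")]
candidates = common_neighbor_scores(known)
```

Pairs with higher scores would be ranked as more likely interactions. A pairwise data‐based method as in (a) would instead rely on protein features, which this sketch ignores.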

Biclique: a graph with two sets of nodes in which every node of one set is connected to every node of the other set.
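
The definition above can be checked directly. The sketch below tests whether two node sets form a biclique in an undirected edge list; the function name and the example node labels are hypothetical and used only for illustration.

```python
from itertools import product


def is_biclique(edges, left, right):
    """Return True if every node in `left` is linked to every node in `right`."""
    # Store undirected edges as frozensets so (u, v) and (v, u) match.
    edge_set = {frozenset(edge) for edge in edges}
    left, right = set(left), set(right)
    if left & right:
        return False  # the two node sets must be disjoint
    return all(frozenset((u, v)) in edge_set for u, v in product(left, right))


# Every node of {A, B} is connected to every node of {X, Y}.
full = [("A", "X"), ("A", "Y"), ("B", "X"), ("B", "Y")]
# Same network with the B-Y edge missing.
partial = [("A", "X"), ("A", "Y"), ("B", "X")]

full_ok = is_biclique(full, {"A", "B"}, {"X", "Y"})
partial_ok = is_biclique(partial, {"A", "B"}, {"X", "Y"})
```

Here `full_ok` is true and `partial_ok` is false, because a single missing edge between the two sets breaks the biclique.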

Pseudocode of a kernel‐based link prediction method.
