How to cite this WIREs title:
WIREs Data Mining Knowl Discov
Impact Factor: 7.250

Mining flexible‐receptor molecular docking data

Abstract

Knowledge discovery in databases has become an integral part of practically every aspect of bioinformatics research, which usually produces, and has to process, very large amounts of data. Rational drug design is one of the scientific areas that has benefited greatly from bioinformatics, particularly in the step that analyzes receptor–ligand interactions via molecular docking simulations. An important challenge is the inclusion of receptor flexibility, since docking simulations that account for it can become computationally very demanding. We have represented this explicit flexibility as a series of different conformations derived from a molecular dynamics simulation trajectory of the receptor. This model has been termed the fully flexible receptor (FFR) model. In our studies, the receptor is the enzyme InhA from Mycobacterium tuberculosis, which is the major drug target for the treatment of tuberculosis. The FFR model of InhA (named FFR_InhA) was docked to four ligands, namely nicotinamide adenine dinucleotide, pentacyano(isoniazid)ferrate II, triclosan, and ethionamide, thus generating very large amounts of data, which need to be mined to produce useful knowledge that can help accelerate drug discovery and development. Very little work has been done in this area. In this article, we review our work on the application of classification decision trees, regression model trees, and association rules to properly preprocessed data from the FFR molecular docking results, and show how these techniques can provide an improved understanding of FFR_InhA‐ligand behavior. Furthermore, we explain how data mining techniques can support the acceleration of molecular docking simulations of FFR models.

© 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011, 1:532–541. DOI: 10.1002/widm.46

This article is categorized under: Algorithmic Development > Biological Data Mining; Technologies > Data Preprocessing

The preprocessing steps needed to generate appropriate inputs for data mining. (a) The definition of each attribute in the input files. (b) An example data mining input file for the ethionamide ligand. (c) Intermediate data preparation steps needed for some data mining techniques. (d) The final data mining inputs for each mining technique. See text for details.

Induced decision tree for the nicotinamide adenine dinucleotide ligand. The leaf nodes are colored according to the free energy of binding (FEB) classes obtained after discretization. Good and Excellent (G and E) FEB classes are in green. Bad and Very Bad (B and VB) FEB classes are in red. The Regular (R) FEB class is in white.
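
The discretization of FEB values into the five classes named in this caption can be sketched as follows. This is only an illustration: the equal-frequency binning, the function name `discretize_feb`, and the sample values are assumptions, not the method or data used in the article. The sketch treats a more negative FEB as better binding.

```python
from typing import List

# Class labels ordered from best (most negative FEB) to worst.
LABELS = ["E", "G", "R", "B", "VB"]


def discretize_feb(values: List[float]) -> List[str]:
    """Assign each FEB value (kcal/mol) to one of five classes by rank.

    Hypothetical equal-frequency binning: values are ranked from most
    negative to least negative and split into five near-equal groups.
    """
    n = len(values)
    if n == 0:
        return []
    # Indices of the values, sorted from most negative (best) upward.
    order = sorted(range(n), key=lambda i: values[i])
    classes = [""] * n
    for rank, idx in enumerate(order):
        # rank * 5 // n maps ranks 0..n-1 onto bins 0..4.
        classes[idx] = LABELS[rank * len(LABELS) // n]
    return classes


# Made-up FEB values, for illustration only.
sample = [-12.1, -9.8, -7.5, -10.4, -6.2, -11.0, -8.3, -5.9, -9.1, -7.0]
print(discretize_feb(sample))
```

The resulting class labels could then serve as the target attribute for a classification decision tree; other discretization schemes (for example, fixed energy thresholds) would fit the same interface.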

Example of M5P algorithm output. (a) The final model tree for the nicotinamide adenine dinucleotide ligand. This tree has 11 linear models and 10 nodes. (b) Description of linear model 1 (LM1) of this model tree.
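
To make the structure in this caption concrete, the sketch below shows how a model tree produces a prediction: internal nodes route an instance by attribute thresholds, and the leaf that is reached applies its own linear model. It is a minimal illustration, not the M5P implementation; the class names, the attribute names `d1` and `d2`, and all thresholds and coefficients are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    """A leaf holding a linear model: intercept + sum(coef * attribute)."""
    intercept: float
    coefs: Dict[str, float]

    def predict(self, row: Dict[str, float]) -> float:
        return self.intercept + sum(c * row[a] for a, c in self.coefs.items())


@dataclass
class Split:
    """An internal node testing `row[attr] <= threshold`."""
    attr: str
    threshold: float
    left: Union["Split", Leaf]   # branch taken when the test is true
    right: Union["Split", Leaf]  # branch taken when the test is false


def predict(node: Union[Split, Leaf], row: Dict[str, float]) -> float:
    """Walk from the root to a leaf, then evaluate that leaf's linear model."""
    while isinstance(node, Split):
        node = node.left if row[node.attr] <= node.threshold else node.right
    return node.predict(row)


# Hypothetical two-leaf tree; all numbers are made up for illustration.
tree = Split(
    attr="d1",
    threshold=4.0,
    left=Leaf(intercept=-10.0, coefs={"d1": 0.5}),
    right=Leaf(intercept=-6.0, coefs={"d2": -0.25}),
)

print(predict(tree, {"d1": 3.0, "d2": 8.0}))  # uses the left leaf model
print(predict(tree, {"d1": 5.0, "d2": 8.0}))  # uses the right leaf model
```

In a binary tree of this kind, 11 leaf models go with 10 internal split nodes, which is consistent with the counts given in the caption.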
