How to cite this WIREs title:
WIREs Data Mining Knowl Discov
Impact Factor: 7.250

Mining large medical claims database to identify high‐risk patients: The case of antidepressant utilization

Abstract

Data mining techniques have been applied to discover knowledge from large observational data sets. In this paper, we focus on mining large medical claims databases to identify high‐risk patients. Patient selection, feature extraction, and feature selection are three important processing steps before popular data mining techniques are successfully applied. Both patient selection and feature extraction require domain knowledge. The episode treatment group methodology is a useful tool for organizing medical claims data. It is used for patient selection and feature extraction in this paper. The specific goal of the study is to identify patients with major depression who have a high risk of receiving inadequate antidepressant medication. A nationwide medical claims database covering a 5‐year period is used for this study. The records of 31,721 high‐risk patients and 50,022 comparison patients were examined for 18 features that include patient demographics, episode factors, and comorbidity factors. After supervised feature selection, three features were selected and analyzed using the classification and regression tree method. The result showed that it is possible to use two of the features (number of non‐antidepressant medications used and average number of claims during an episode of major depression) to identify a group of high‐risk patients. These patients are 2.67 times more likely to have inadequate antidepressant medication than the comparison patients.

© 2011 John Wiley & Sons, Inc. WIREs Data Mining Knowl Discov 2011, 1:154‐163. DOI: 10.1002/widm.5

This article is categorized under:
Application Areas > Health Care
Fundamental Concepts of Data and Knowledge > Knowledge Representation

Patient selection procedure for antidepressant utilization study.

Region representing high‐risk patients (not to scale).

The resulting classification and regression tree with nine leaves.

Cross‐validation relative error as a function of tree size.

Feature selection results.
