Integrated Theory- and Data-driven Feature Selection in Gene Expression Data Analysis

DocUID: 2017-003

Authors: Vineet K. Raghu, Xiaoyu Ge, Panos K. Chrysanthis, Panayiotis V. Benos

Abstract: The exponential growth of high-dimensional biological data has led to a rapid increase in demand for automated approaches to knowledge production. Existing methods rely on two general approaches to address this challenge: 1) the Theory-driven approach, which utilizes prior accumulated knowledge, and 2) the Data-driven approach, which utilizes only the data to deduce scientific knowledge. Used alone, each of these approaches suffers from bias toward past or present knowledge, because it fails to incorporate all of the current knowledge that is available to make new discoveries. In this paper, we show how an integrated method can effectively address the high dimensionality of big biological data, which is a major problem for purely data-driven analysis approaches. We realize our approach in a novel two-step analytical workflow that incorporates a new feature selection paradigm as the first step, to handle high-throughput gene expression data, and that utilizes graphical causal modeling as the second step, to handle the automatic extraction of causal relationships. Our results on real-world clinical datasets from The Cancer Genome Atlas (TCGA) demonstrate that our method is capable of intelligently selecting genes for learning effective causal networks.

Published In: The 2nd International Workshop on Health Data Management and Mining

Pages: 1525-1532

Year Published: 2017

Project: Gene Feature Selection

Subject Area: Data Exploration, Knowledge Production

Publication Type: Workshop Paper

Sponsor: U54HG008540, R01LM012087, T32CA082084

Citation: Vineet K. Raghu, Xiaoyu Ge, Panos K. Chrysanthis, and Panayiotis V. Benos. Integrated Theory- and Data-driven Feature Selection in Gene Expression Data Analysis. The 2nd International Workshop on Health Data Management and Mining. 1525-1532. 2017.