Selecting informative genes for discriminant analysis using multigene expression profiles

BMC Genomics
BioMed Central
Open Access
Research
Xin Yan¹ and Tian Zheng*²
Address: ¹Russell Investments, Tacoma, WA, USA and ²Department of Statistics, Columbia University, New York, NY, USA
Email: Xin Yan - XYan@russell.com; Tian Zheng* - tzheng@stat.columbia.edu
* Corresponding author
from IEEE 7th International Conference on Bioinformatics and Bioengineering at Harvard Medical School, Boston, MA, USA. 14–17 October 2007
Published: 16 September 2008
BMC Genomics 2008, 9(Suppl 2):S14
doi:10.1186/1471-2164-9-S2-S14
Supplement: IEEE 7th International Conference on Bioinformatics and Bioengineering at Harvard Medical School. Edited by Mary Qu Yang, Jack Y Yang, Hamid R Arabnia and Youping Deng. Research. http://www.biomedcentral.com/content/pdf/1471-2164-9-S2-info.pdf
This article is available from: http://www.biomedcentral.com/1471-2164/9/S2/S14
© 2008 Yan and Zheng; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Background: Gene expression data extracted from microarray experiments have been used to study differences in the mRNA abundance of genes under different conditions. In one such experiment, thousands of genes are measured simultaneously, which provides a high-dimensional feature space for discriminating between different sample classes. However, most of these dimensions are not informative about the between-class difference and add noise to the discriminant analysis.
Results: In this paper we propose and study feature selection methods that evaluate the "informativeness" of a set of genes. Two measures of information based on multigene expression profiles are considered for a backward information-driven screening approach to selecting important gene features. By considering multigene expression profiles, we are able to utilize interaction information among these genes. Using a breast cancer data set, we illustrate our methods and compare their performance with that of existing methods.
Conclusion: We illustrate in this paper that methods considering gene-gene interactions have better classification power in gene expression analysis. In our results, we identify important genes with relatively large p-values from single-gene tests. This indicates that these are genes with weak marginal information but strong interaction information, which would be overlooked by strategies that only examine individual genes.
Introduction
Gene expression data that measure mRNA abundance in samples under different conditions provide a valuable tool for studying the difference between the molecular activities of an organism under these conditions [1,2]. Such a study is usually based on a discriminant analysis of the sample classes (under different "conditions") using the gene expression profiles observed in the experiments. Because of the large number of genes that are measured in one microarray experiment, a critical step is to select the genes that are informative about the between-class difference. Such a selection also allows researchers to identify genes that are potentially relevant to the between-class difference in the molecular activities.
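To make concrete the idea of a gene that is uninformative on its own but informative jointly with another gene, the following is a minimal illustrative sketch in Python. It is not the information measures or the screening procedure proposed in this paper. The simulated data, the function names (mutual_information, simulate, binarize), the median-based discretization, and the 10% label noise are all assumptions chosen for illustration. The sketch uses empirical mutual information between discretized expression levels and the class label as a stand-in for "informativeness".

```python
import math
import random
import statistics
from collections import Counter


def mutual_information(features, labels):
    """Empirical mutual information (bits) between discrete feature tuples and labels."""
    n = len(labels)
    joint = Counter(zip(features, labels))
    feature_counts = Counter(features)
    label_counts = Counter(labels)
    return sum(
        (c / n) * math.log2(c * n / (feature_counts[f] * label_counts[y]))
        for (f, y), c in joint.items()
    )


def simulate(n_samples=400, seed=0):
    """Hypothetical data: the class depends on genes 1 and 2 only through their interaction."""
    rng = random.Random(seed)
    expression, labels = [], []
    for _ in range(n_samples):
        g1, g2, g3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        # Class 1 when genes 1 and 2 disagree in sign; gene 3 is pure noise.
        y = int((g1 > 0) != (g2 > 0))
        if rng.random() < 0.1:  # assumed 10% label noise
            y = 1 - y
        expression.append((g1, g2, g3))
        labels.append(y)
    return expression, labels


def binarize(expression):
    """Discretize each gene to low (0) or high (1) relative to its sample median."""
    n_genes = len(expression[0])
    medians = [statistics.median(row[j] for row in expression) for j in range(n_genes)]
    return [tuple(int(row[j] > medians[j]) for j in range(n_genes)) for row in expression]


expression, labels = simulate()
levels = binarize(expression)

# Marginal information: one gene at a time.
single_gene_mi = [
    mutual_information([(row[j],) for row in levels], labels) for j in range(3)
]
# Joint information: the two-gene profile of genes 1 and 2.
pair_mi = mutual_information([(row[0], row[1]) for row in levels], labels)

for j, mi in enumerate(single_gene_mi, start=1):
    print(f"gene {j} alone: {mi:.3f} bits")
print(f"genes 1 and 2 jointly: {pair_mi:.3f} bits")
```

In this simulated setting each gene taken alone carries almost no information about the class, while the two-gene profile of genes 1 and 2 carries substantially more. A selection strategy that ranks genes one at a time would therefore tend to discard both interacting genes.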