Improving prediction accuracy of tumor classification by reusing genes discarded during gene selection
12 pages
English

Description

Background: The high dimensionality of gene expression microarray data sets degrades the generalization performance of classifiers, so feature selection, which keeps relevant features and discards irrelevant and redundant ones, has been widely used in bioinformatics. Multi-task learning is a novel technique that improves the prediction accuracy of tumor classification by using the information contained in such discarded redundant features, but which features should be discarded, used as input, or used as output remains an open issue. Results: We demonstrate a framework that uses a genetic algorithm to automatically select which features are input, output, or discarded, and propose two algorithms: GA-MTL (genetic algorithm based multi-task learning) and e-GA-MTL (an enhanced version of GA-MTL). Experimental results demonstrate that this framework is effective at selecting features for multi-task learning, and that GA-MTL and e-GA-MTL perform better than other heuristic methods. Conclusions: Genetic algorithms are a powerful technique for automatically selecting features for multi-task learning; GA-MTL and e-GA-MTL are shown to improve the generalization performance of classifiers on microarray data sets.

Information

Published on 1 January 2008
Language: English

Excerpt

BMC Genomics
BioMed Central
Open Access
Research
Improving prediction accuracy of tumor classification by reusing genes discarded during gene selection
Jack Y Yang (1), GuoZheng Li* (2,3), HaoHua Meng (2), Mary Qu Yang (4) and Youping Deng (5)
Address: (1) Harvard Medical School, Harvard University, Cambridge, Massachusetts 02140-0888, USA; (2) School of Computer Engineering & Science, Shanghai University, Shanghai 200072, China; (3) Institute of Systems Biology, Shanghai University, Shanghai 200072, China; (4) National Human Genome Research Institute, National Institutes of Health, U.S. Department of Health and Human Services, Bethesda, MD 20892, USA; and (5) Department of Biological Sciences, University of Southern Mississippi, Hattiesburg, MS 39406, USA
Email: Jack Y Yang - jyang@bwh.harvard.edu; GuoZheng Li* - gzli@shu.edu.cn; HaoHua Meng - mhhtj@shu.edu.cn; Mary Qu Yang - yangma@mail.nih.gov; Youping Deng - youping.deng@usm.edu
* Corresponding author
from The 2007 International Conference on Bioinformatics & Computational Biology (BIOCOMP'07), Las Vegas, NV, USA. 25-28 June 2007
Published: 20 March 2008
BMC Genomics 2008, 9(Suppl 1):S3
doi:10.1186/1471-2164-9-S1-S3
Supplement: The 2007 International Conference on Bioinformatics & Computational Biology (BIOCOMP'07). Editors: Jack Y Jang, Mary Qu Yang, Mengxia (Michelle) Zhu, Youping Deng and Hamid R Arabnia. Research.
This article is available from: http://www.biomedcentral.com/1471-2164/9/S1/S3
© 2008 Yang et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Background: Since the high dimensionality of gene expression microarray data sets degrades the generalization performance of classifiers, feature selection, which selects relevant features and discards irrelevant and redundant features, has been widely used in the bioinformatics field. Multi-task learning is a novel technique to improve prediction accuracy of tumor classification by using information contained in such discarded redundant features, but which features should be discarded or used as input or output remains an open issue.
Results: We demonstrate a framework for automatically selecting features to be input, output, and discarded by using a genetic algorithm, and propose two algorithms: GA-MTL (Genetic algorithm based multi-task learning) and e-GA-MTL (an enhanced version of GA-MTL). Experimental results demonstrate that this framework is effective at selecting features for multi-task learning, and that GA-MTL and e-GA-MTL perform better than other heuristic methods.
Conclusions: Genetic algorithms are a powerful technique to select features for multi-task learning automatically; GA-MTL and e-GA-MTL are shown to improve generalization performance of classifiers on microarray data sets.
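The idea of letting a genetic algorithm assign each feature a role (input, output, or discarded) can be illustrated with a minimal sketch. It is not the GA-MTL or e-GA-MTL procedure itself: the three-valued chromosome encoding, the operators (truncation selection, one-point crossover, elitism), all parameter values, and the names `split_roles`, `evolve` and `toy_fitness` are assumptions made for illustration. In a real setting the fitness would presumably be the validation accuracy of a multi-task learner trained with the chosen input and output features; here a toy fitness stands in for it.

```python
import random

# Hypothetical role labels for each gene (feature); not taken from the paper.
DISCARD, INPUT, OUTPUT = 0, 1, 2
ROLES = (DISCARD, INPUT, OUTPUT)


def split_roles(chromosome):
    """Return the feature indices assigned to input, output and discard."""
    inputs = [i for i, r in enumerate(chromosome) if r == INPUT]
    outputs = [i for i, r in enumerate(chromosome) if r == OUTPUT]
    discarded = [i for i, r in enumerate(chromosome) if r == DISCARD]
    return inputs, outputs, discarded


def toy_fitness(chromosome,
                target=(INPUT, INPUT, OUTPUT, DISCARD,
                        OUTPUT, DISCARD, INPUT, DISCARD)):
    """Stand-in fitness: number of genes matching an arbitrary target.

    A real fitness would train a multi-task learner on the chosen roles
    and return its validation accuracy (assumption).
    """
    return sum(g == t for g, t in zip(chromosome, target))


def evolve(n_features, fitness, pop_size=20, generations=30,
           mutation_rate=0.05, seed=0):
    """Simple generational GA over three-valued chromosomes."""
    rng = random.Random(seed)
    population = [[rng.choice(ROLES) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0][:]]              # elitism: keep the best as is
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: occasionally reassign a gene to a random role.
            child = [rng.choice(ROLES) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
    return max(population, key=fitness)


best = evolve(8, toy_fitness)
inputs, outputs, discarded = split_roles(best)
```

After the run, `inputs` would feed the classifier, `outputs` would serve as auxiliary prediction targets for multi-task learning, and `discarded` would be ignored; how the paper's algorithms actually encode and evaluate these roles is not shown in this excerpt.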
Background
Tumor classification is performed on microarray data collected by DNA microarray experiments from tissue and cell samples [13]. The wealth of such data for different stages of the cell cycle aids in the exploration of gene interactions and in the discovery of gene functions. Moreover, genome-wide expression data from tumor tissues gives insight into the variation of gene expression across tumor types, thus providing clues for tumor classification of individual samples. The output of a microarray experi