DNA motif alignment by evolving a population of Markov chains

biomed - Bi

Découvre YouScribe en t'inscrivant gratuitement

Je m'inscris

Obtenez un accès à la bibliothèque pour le consulter en ligne
En savoir plus

12 pages

English

Obtenez un accès à la bibliothèque pour le consulter en ligne
En savoir plus

A propos
Informations
Extrait

Description

Deciphering cis -regulatory elements or de novo motif-finding in genomes still remains elusive although much algorithmic effort has been expended. The Markov chain Monte Carlo (MCMC) method such as Gibbs motif samplers has been widely employed to solve the de novo motif-finding problem through sequence local alignment. Nonetheless, the MCMC-based motif samplers still suffer from local maxima like EM. Therefore, as a prerequisite for finding good local alignments, these motif algorithms are often independently run a multitude of times, but without information exchange between different chains. Hence it would be worth a new algorithm design enabling such information exchange. Results This paper presents a novel motif-finding algorithm by evolving a population of Markov chains with information exchange (PMC), each of which is initialized as a random alignment and run by the Metropolis-Hastings sampler (MHS). It is progressively updated through a series of local alignments stochastically sampled. Explicitly, the PMC motif algorithm performs stochastic sampling as specified by a population-based proposal distribution rather than individual ones, and adaptively evolves the population as a whole towards a global maximum. The alignment information exchange is accomplished by taking advantage of the pooled motif site distributions. A distinct method for running multiple independent Markov chains (IMC) without information exchange, or dubbed as the IMC motif algorithm, is also devised to compare with its PMC counterpart. Conclusion Experimental studies demonstrate that the performance could be improved if pooled information were used to run a population of motif samplers. The new PMC algorithm was able to improve the convergence and outperformed other popular algorithms tested using simulated and biological motif sequences.

Informations

Publié par	biomed
Publié le	01 janvier 2009
Nombre de lectures	8
Langue	English

Extrait

Pga e 1fo1 (2apegum nr bet nor foaticnoitrup esops)

Abstract Background: Deciphering cis -regulatory elements or de novo motif-finding in genomes still remains elusive although much algorithmic effort has been e xpended. The Markov chain Monte Carlo (MCMC) method such as Gibbs motif samplers has been widely employed to solve the de novo motif-finding problem through se quence local alignment. Noneth eless, the MCMC-based motif samplers still suffer from local maxima like EM. Therefore, as a prerequisite fo r finding good local alignments, these motif algorithms are often independently run a multitude of times, but wi thout information exchange between different chains. Hence it would be worth a new algorith m design enabling such information exchange. Results: This paper presents a novel motif-finding al gorithm by evolving a population of Markov chains with information exchange (PMC), each of which is initialized as a random alignment and run by the Metropolis-Hastings sampler (MHS). It is progressively updated thro ugh a series of local alignments stochastically sampled. Explicitly, the PMC motif algorithm performs stochastic sampling as specified by a population-based proposal distri bution rather than indivi dual ones, and adaptively evolves the population as a whole towards a glob al maximum. The alignment information exchange is accomplished by taking advant age of the pooled motif site dist ributions. A distinct method for running multiple independent Markov chains (I MC) without information exchange, or dubbed as the IMC motif algorithm, is also devised to compare with its PMC counterpart. Conclusion: Experimental studies demonstrate that the performance could be improved if pooled information were used to run a population of mo tif samplers. The new PMC algorithm was able to improve the convergence and outperformed other popular algorithms tested using simulated and biological motif sequences.

from The Seventh Asia Pacific Bioinformatics Conference (APBC 2009) Beijing, China. 13–16 January 2009 Published: 30 January 2009 BMC Bioinformatics 2009, 10 (Suppl 1):S13 doi:10.1186/1471-2105-10-S1-S13 <supplement> <title> <p>Selected papers from the Seventh Asia-Pacific Bioinformatics Conference (APBC 2009)</p> </title> <edito r>Michael Q Zhang, Michael S Waterman and Xuegong Zhang</editor> <note>Research</note> </supplement> This article is available from: http:/ /www.biomedcentral.com/1471-2105/10/S1/S13 © 2009 Bi; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons. org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the orig inal work is properly cited.

BMC Bioinformatics

Address: 1 Bioinformatics and Intelligent Computing Lab, Division of Clinic al Pharmacology, Children's Merc y Hospitals, Kansas City, Missouri, USA and 2 Schools of Medicine, and Computing and Engineering, University of Missouri, Kansas City, Missouri, USA Email: Chengpeng Bi - cbi@cmh.edu

Research Open Access DNA motif alignment by evolving a population of Markov chains Chengpeng Bi 1,2

Bio Med Central

Background biological and pathological processes [1]. Although much Discovering cis -regulatory elements or DNA motifs in algorithmic effort has been expended, it still remains chal-genomic sequences is fundamental to build genetic net- lenging (see recent reviews in [2,3]). Multiple sequence works and important to understand gene regulation in local alignment coupling with position weight matrix