Universal DNA Tag Systems: A Combinatorial Design Scheme
13 pages
English

Description

Level: Higher education, Doctorate (Bac+8)




Presented at RECOMB 2000, extended version appeared in Journal of Computational Biology
Universal DNA Tag Systems: A Combinatorial Design Scheme
Amir Ben-Dor∗, Richard Karp†, Benno Schwikowski‡, Zohar Yakhini§
∗ Department of Computer Science & Engineering, University of Washington. Supported by the Program in Mathematics and Molecular Biology (amirbd@cs.washington.edu).
† International Computer Science Institute and Mathematical Sciences Research Institute, University of California at Berkeley. Supported by NSF grant at the University of Washington (karp@icsi.berkeley.edu).
‡ Department of Computer Science & Engineering, University of Washington. Supported by the German Academic Exchange Service (DAAD) (benno@cs.washington.edu).
§ Chemical and Biological Systems Department, Agilent Laboratories, a Hewlett-Packard Subsidiary (zohary@hpl.hp.com).

Abstract

Custom-designed DNA arrays offer the possibility of simultaneously monitoring thousands of hybridization reactions. These arrays show great potential for many medical and scientific applications such as polymorphism analysis and genotyping. Relatively high costs are associated with the need to specifically design and synthesize problem-specific arrays. Recently, an alternative approach was suggested that utilizes fixed, universal arrays. This approach presents an interesting design problem: the arrays should contain as many probes as possible, while minimizing experimental errors caused by cross-hybridization. We use a simple thermodynamic model to cast this design problem in a formal mathematical framework. Employing new combinatorial ideas, we derive an efficient construction for the design problem, and prove that our construction is near-optimal.

1 Introduction

Oligonucleotides are short single-stranded pieces of DNA (typically 15-50 nucleotides) made by chemical synthesis. In solution, oligonucleotides tend to specifically hybridize with their Watson-Crick complements [21], and form a stable DNA duplex. This specificity is exploited in molecular hybridization assays, in which oligonucleotides are used as probes to identify any complementary (or near-complementary) DNA from a complex mixture of target DNA.

Array-based hybridization assays, introduced in the late 1980s [6, 13, 15, 3, 5, 8], offer the possibility of simultaneously monitoring a multitude (currently up to tens of thousands) of hybridization reactions. In such an assay, a target-specific set of oligonucleotides is synthesized on a solid support surface (e.g., silicon or glass). A fluorescently labeled target sample mixture of DNA or RNA fragments is then brought in contact with the treated surface, and allowed to hybridize with the synthesized oligonucleotides. Scanning the fluorescent labels of the fragments attached to the array reveals information about the content of the sample mixture. Theoretically, the assay conditions are such that hybridization only occurs in sites on the surface that are Watson-Crick complements to some substring in the target. In practice, cross-hybridization is a main source of cross-signal contamination in any array-based hybridization assay.

Array-based hybridization assays show great potential for many different applications such as SNP genotyping [12], gene expression profiling [4], and resequencing DNA [14, 12]. Recently, S. Brenner and others [1, 2] suggested an alternative approach based on universal arrays containing oligonucleotides called antitags. The Watson-Crick complement of each antitag is called a tag. The tag–antitag pairs are designed so that each tag hybridizes strongly to its complementary antitag, but not to any other antitag. In this approach, the analysis of a DNA sample consists of two steps: solution-phase hybridization followed by solid-phase hybridization. In the first step, hybridization takes place between the target DNA in solution and a set of oligonucleotide precursors called reporter molecules. Each reporter molecule consists of a target-specific part ligated to a unique tag. Reporter/target hybridization events are registered (e.g., by an enzymatic reaction). In the second step the modified precursors are introduced to the array. Tags form duplexes with the corresponding antitags. Thus, the reporter molecules are sorted into different locations on the array and hybridization events can be determined. This approach has several advantages:

• Complicated array manufacturing processes are required only for the fixed, universal component of the assay. These universal components can therefore be mass-produced, significantly reducing manufacturing costs.

• The assay components that need to be designed for a specific target are involved in solution-phase processes. The underlying nucleic acid chemistry and thermodynamics are better understood than the same aspects of surface-based processes. Therefore a more efficient and effective design process is facilitated.

As an example, we describe a multiplexed SNP genotyping assay. SNPs (single nucleotide polymorphisms) are differences, across the population, in a single base, within an otherwise conserved genomic sequence [9]. Genotyping is a process that determines the variants present in a given sample, over a set of SNPs. This assay uses off-the-shelf universal components: a universal set of oligonucleotide tags and a universal array of antitags. The antitags, immobilized on the array, are Watson-Crick complements of the tags in the mixture. The whole system will be called a DNA Tag/AntiTag system and in short a DNA TAT system. Consider a set of SNPs to be genotyped. The assay is performed as follows (see Fig. 1):

1. A set of reporter molecules (one for each SNP) is synthesized in solution. Each reporter molecule consists of two parts that are ligated (in string language: concatenated) together. The first part is the Watson-Crick complement of the upstream sequence that immediately precedes the polymorphic site of the SNP. The second part of each reporter molecule is a unique tag from the universal set of tags.

2. When an individual is to be genotyped, a sample is prepared that contains the sequences flanking each of the SNP loci. The sample is mixed with the reporter molecules. Solution-phase hybridization then takes place. Assuming that specificity is perfect, this results in the flanking sequences of the SNPs paired only with the appropriate reporter molecule.

3. Single nucleotides, A, C, T, G, fluorescently labeled with four distinct colors, are added to the mixture. These labeled nucleotides hybridize to the polymorphic site of each SNP and are ligated to the corresponding reporter molecule. That is, each reporter molecule is extended by exactly one labeled nucleotide.

4. The extended reporter molecules are separated from the sample fragments, and brought into contact with the universal array. Assuming that specificity is perfect, the tag part of each reporter molecule will only hybridize to its complementary antitag on the array. Thus the extended reporter molecules sort into the array sites where the corresponding antitag is present.

5. For each site of the array, the fluorescent colors present at that site are detected. The colors indicate which bases were used for the extension at the corresponding SNP site, and thus reveal the SNP variations present in the individual.

The design problem for a DNA TAT system presents a tradeoff. Clearly, it is desirable to have as many tags as possible, in order to maximize the number of SNPs that can be genotyped in parallel. On the other hand, if too many tags are used, similar tags will necessarily entail cross-hybridization events (where tags hybridize to foreign antitags), reducing the accuracy of the assay.

This design problem was identified in previous work and several formulations and solutions were proposed [10, 1, 2, 18, 11]. These papers differ both in the way hybridization is modeled, and in the algorithmic approach employed to find a good DNA TAT system. In [10] a TAT system is described as a part of a strategy for surface-based DNA computing. The authors take a coding theory approach and choose to model cross-hybridization constraints as general Hamming distance conditions. A set of 108 8-mers, with a 50% G/C content, which differ in at least 4 bases from each other, is

[Fig. 1, panel annotations]

Fragments spanning the polymorphism sites for all the SNPs in the set are extracted. The different shapes denote different variants.

Oligonucleotides complementary to the sequences immediately preceding the polymorphism sites are tagged by DNA tags, designed to specifically hybridize to their complements on the array.

Extension reactions take place in solution phase, in the presence of a mixture of all four dideoxy nucleotides (differentially fluorescently labeled) and an appropriate enzyme. For each SNP the extending base is the one complementary to the one corresponding to the base present in the sample sequence. After separation (the whole process can be performed in high temperature) a mixture of reporter molecules is formed. This mixt
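
The Hamming-distance formulation that the text attributes to [10] (8-mers with 50% G/C content that pairwise differ in at least 4 bases) can be illustrated with a small sketch. This is not the construction developed in this paper, which relies on a thermodynamic model, and it is not claimed to be the procedure used in [10]. The function names, the lexicographic greedy selection, and the convention that an antitag is the reverse complement of its tag read 5' to 3' are all assumptions made for illustration. A greedy pass of this kind is not expected to reproduce the set of 108 tags mentioned above.

```python
from itertools import product

# Watson-Crick base pairing.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def antitag(tag):
    """Return the Watson-Crick complement of a tag.

    Assumption: both strands are written 5'->3', so the complement
    is the reverse complement of the tag string.
    """
    return "".join(COMPLEMENT[base] for base in reversed(tag))


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def gc_count(seq):
    """Number of G or C bases in a sequence."""
    return sum(base in "GC" for base in seq)


def greedy_tag_set(length=8, min_dist=4, gc=4):
    """Greedily collect tags that satisfy the two stated constraints.

    Candidates are scanned in lexicographic order (an illustrative
    choice); a candidate is kept if it has exactly `gc` G/C bases and
    differs from every tag kept so far in at least `min_dist` positions.
    """
    chosen = []
    for bases in product("ACGT", repeat=length):
        candidate = "".join(bases)
        if gc_count(candidate) != gc:
            continue
        if all(hamming(candidate, tag) >= min_dist for tag in chosen):
            chosen.append(candidate)
    return chosen
```

Calling `greedy_tag_set()` with the defaults corresponds to the 8-mer, 50% G/C, distance-4 setting; smaller parameters such as `greedy_tag_set(length=6, min_dist=3, gc=3)` run faster and are convenient for experimenting. Mapping each resulting tag through `antitag` gives the probe sequences that would sit on the universal array under the stated convention.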
