From subtrees to supertrees [Elektronische Ressource] / Sebastian Böcker

Publié par

From Subtrees to SupertreesSebastian B¨ockerUniversität BielefeldGraduiertenkollegStrukturbildungs−prozesseDissertationzur Erlangung des Doktorgradesder Fakult¨at fu¨r MathematikUniversit¨at BielefeldBielefeld, im August 1999ii1. Gutachter: Prof. Andreas W.M. Dress2. Gutachter: Prof. Michael A. Steel (Christchurch, Neuseeland)3. Gutachter: Prof. Thomas ZinkDatum der mu¨ndlichen Pru¨fung: 28. Januar 2000Gedruckt auf alterungsbesta¨ndigem Papier 1 ISO 9706.AcknowledgmentsThis thesis was written under the supervision of Prof. Andreas W.M. Dress.Parts of this thesis have been published in advance [BD00b,BDS99], and furtherpapers are forthcoming [BBDS,BD,B¨oca,B¨ocb].Theworkwascarried outwhilebeingsupportedbythe“Graduiertenkolleg Struk-turbildungsprozesse”thatispartoftheCenterforInterdisciplinaryResearchonStruc-ture Formation (FSPM) at the University of Bielefeld, Germany.First and foremost, I wish to thank my supervisor Professor Andreas Dress.Without his ideas, comments, and contributions this thesis would not exist. I alsothank him for bringing me into contact with the topic, for fruitful discussions, andfor maintaining my enthusiasm. His promotion made it possible for me to go toChristchurch, New Zealand, for three months.Special thanks go to Dr. Mike Steel who asked the right question at the righttimeandtherebygavemethechancetowritethisthesis.
Publié le : vendredi 1 janvier 1999
Lecture(s) : 16
Tags :
Source : D-NB.INFO/969280246/34
Nombre de pages : 106
Voir plus Voir moins

FromSubtreestoSupertrees

SebastianBo¨cker

Universität Bielefeld
SGtrrauktduuierrbildutenkngollegs-
prozesse

Dissertation
zurErlangungdesDoktorgrades
derFakulta¨tfu¨rMathematik
Universita¨tBielefeld

Bielefeld,imAugust1999

ii

.1

.2

.3

Gutachter:

Gutachter:

Gutachter:

.forP

.forP

forP.

Andreas

.M.W

sserD

MichaelA.Steel(Christchurch,

ThomasZink

Datumdermu¨ndlichenPru¨fung:28.Januar2000

Neuseeland)

Gedrucktaufalterungsbesta¨ndigemPapier

1ISO9706.

Acknowledgments

ThisthesiswaswrittenunderthesupervisionofProf.AndreasW.M.Dress.
Partsofthisthesishavebeenpublishedinadvance[BD00b,BDS99],andfurther
papersareforthcoming[BBDS,BD,Bo¨ca,Bo¨cb].
Theworkwascarriedoutwhilebeingsupportedbythe“GraduiertenkollegStruk-
turbildungsprozesse”thatispartoftheCenterforInterdisciplinaryResearchonStruc-
tureFormation(FSPM)attheUniversityofBielefeld,Germany.
Firstandforemost,IwishtothankmysupervisorProfessorAndreasDress.
Withouthisideas,comments,andcontributionsthisthesiswouldnotexist.Ialso
thankhimforbringingmeintocontactwiththetopic,forfruitfuldiscussions,and
formaintainingmyenthusiasm.Hispromotionmadeitpossibleformetogoto
Christchurch,NewZealand,forthreemonths.
SpecialthanksgotoDr.MikeSteelwhoaskedtherightquestionattheright
timeandtherebygavemethechancetowritethisthesis.Heisnotonlymy“unocial
supervisor”butalsomyhikingtrainer,andhecoulddoubtlesslymakeathirdcareer
asacartographer.ThankyouverymuchforthehospitalityinChristchurch,andfor
themaps!
Thanks(withoutanyspecialorder)goto:MygirlfriendClaudiaHu¨ttnerfor
listeningandnotcomplaining;AnkeSchnitkerforthechats;myroommatesat
university,ChristianAlscherandSwenOsterkamp,forkeepingmesanemost
ofthetime;everybodyelseintheGraduiertenkolleg;myprimaryschoolteacher
FriedelKaraschewskifortellingmetostudymathematics;KordulaDo¨ringfor
theneedles;ThomasMu¨ller,NikolausWiesner,andthecityofHamburgfor
brainrelaxation;CakeforthreegreatCDs;ProfessorPeterSlodowyforshowing
methefunandbeautyofmathematics;theChristchurchiansforawonderful
time;BernhardBalkenhol,HolgerPaschke,andKarl-HermannWieners
forcomputersupport;MarcKupietzforcigarettes,movies,andmusic;myParents
foreverything.
Inaccordancewithmathematicsprotocol,Iwillendeavortousethepersonal
pronoun“we”toindicatethereaderandthewriter,ratherthanthe“royalwe.”

iii

i

v

ACKNOWLEDGM

STNE

Contents

Chapter1.Introduction
1.1.Theproblemoftreeamalgamation
1.2.Motivationviaapplications
1.3.Themainresult
1.4.Structureofthisthesis
1.5.Furthersteps
1.6.Basicnotations
Chapter2.Graphs,Trees,andPatches
2.1.Graphsandpaths
2.2.Trees
2.3.Patches
2.4.Twoequivalencerelationsonthesetofvertices
Chapter3.X-TreesandSplitSystems
3.1.Denitions
3.2.Basicobservations
3.3.InducedY-treesandsplitsystems
3.4.Inducingsubtreesviapatchesandsubsetsofedges
3.5.ConnectingpartialsplitsystemsandX-trees
Chapter4.TheTreeJoiningTheorem
4.1.Proofoftheexistence
4.2.Proofoftheuniqueness
Chapter5.ClusterSystemsandPatchworks
5.1.Denitions,notations,andbasicobservations
5.2.Mainresults
5.3.Examples
5.4.ProofofTheorem4
5.5.ProofofTheorems5and2incase#C<1
5.6.Applicationsofpatchworks
5.7.Droppingthenitenessrestriction
Chapter6.QuartetEncodingsandExcess
v

11223457781141717181023282131323536304144454640575

vi

viCONTENTS
6.1.Denitionsandbasicobservations
6.2.Examplesofquartetencodings
6.3.Tightquartetencodings
6.4.Inducedencodingsandq-patches
Chapter7.TheMainTheorem
7.1.Condition(ii)impliescondition(i)
7.2.Condition(i)impliescondition(ii)
7.3.Firstclaim
7.4.Secondclaim
7.5.Thirdclaim
Chapter8.ImplicationsandApplications
8.1.Reconstructingtreesfromquartetsplits
8.2.Reconstructingtreesfromsubtrees
8.3.Fastamalgamationofanexcess-freecollectionoftrees
8.4.Thedyadicclosure
Bibliography
ListofSymbols
xednI

758506265656669617277777385888195999

21.INTRODUCTION
Infact,eventheproblemofdeterminingwhetherthereexistsatleastonesupertreeof
TwasproventobeNP-completebySteel[Ste92];anditremainsNP-completeifwe
assume—withoutlossofgenerality,aswewillseeinSection3.5—thatalltreesinT
havepreciselyfourleaves.Thecomplexityofdeterminingwhethersome(known!)tree
istheuniquesupertreeofagivencollectionisstillunknown,butstronglyconjectured
tobeNP-completeaswell.AndifthetreesinTcanbeamalgamated,theremaybe
(exponentially)manysupertrees;seeExample19onpage81fora“non-trivial”case
whereallofthese(exponentiallymany)supertreesarebinary.

1.2.Motivationviaapplications
Invariouseldsofclassicationsuchasbiology[CSW87,SPH98,SOWH96],lin-
guistics[War97],orstemmatology[BBHR98,Rob96],treesareusedtorepresentevolu-
tionary,historical,orhierarchicalrelationships.Inbiology,suchtrees(“phylogenies”)
typicallyrepresenttheevolutionaryhistoryofacollectionofextantspecies,orthe
lineofdescentofsomegene.Inmostapplications,theobjectsofinterestarefound
atthetips(leaves)ofthetree,whileallinteriorverticescorrespondtoabranching
(orspeciation)event.
Thereareseveralreasonswhyamalgamationproblemsasdescribedabovearise
naturallyintheseelds.Firstly,wemaywishtocombinetreesthathavebeen
reconstructedusingdistinct,thoughoverlappingcollectionsofobjects(usuallyby
dierentresearchersusingdierentdataand,asoftenasnot,dierentreconstruc-
tionmethods).Asecondreasonisthat,ingeneral,itisdiculttoaccuratelyre-
constructlargetreesdirectly,andwemaychooseinsteadtoreconstruct“trustwor-
thy”treesforsmallsubsets,andthencombinetheseinasupertree(orsupertrees),
see[BD86,HNP+98,HVW99,HW99,SvH96,Wil98].Anadditionalreasonforthisap-
proachisthat,forgeneticdata,thenumberofsitesthatcanbeaccuratelyaligned
acrossasmallnumberofcloselyrelatedsequencesisoftenmuchlargerthanthecorre-
spondingnumberofsitesforasetthatislargeandincludesratherdiversesequences.
Also,theaccuracyofpolynomialtimetreereconstructionmethods(likeneighborjoin-
ingorheuristicsformaximumparsimony)appearstodeterioratewiththemaximum
evolutionarydistanceinaphylogeny[RW97,HNR+98].

1.3.Themainresult
Althoughtheproblemsof(a)ndingasupertreeand(b)testingwhetheritis
uniquelydeterminedareknowntobehard,afullanswercanbegiventothesec-
ondquestion(implyingaswellanecientwaytoeventuallyconstructthisunique
supertree)inthefollowing,particularlyinterestingcase:Givenacollectionofbinary
(ornon-degenerated)treesT={T1,T2,...,Tk},letXjdenotethesetofleaves(or
pendingvertices)ofthetreeTjforj=1,...,k,letXdenotethesetofleavesofa
binarytreeT,andassumethatXcoincideswiththeunionofthe(notnecessarily
disjoint)setsX1,...,Xk.IfTistheuniquesupertreeofthecollectionT,then
k#X3(#Xj3)
X1=j

1.4.STRUCTUREOFTHISTHESIS3
mustalwayshold,suggestingthatparticularattentionshouldbedevotedtothemini-
mum1case—thatis,thecasewhereequalityholds.Tothisend,wedenetheexcess
ofthecollectionTby
exc(T):=#Xj(#Xj3)3,
[kXk
j=1j=1
andwede

Soyez le premier à déposer un commentaire !

17/1000 caractères maximum.