La lecture à portée de main
Description
Sujets
Informations
Publié par | universitat_bielefeld |
Publié le | 01 janvier 1999 |
Nombre de lectures | 17 |
Extrait
FromSubtreestoSupertrees
SebastianBo¨cker
Universität Bielefeld
SGtrrauktduuierrbildutenkngollegs-
prozesse
Dissertation
zurErlangungdesDoktorgrades
derFakulta¨tfu¨rMathematik
Universita¨tBielefeld
Bielefeld,imAugust1999
ii
.1
.2
.3
Gutachter:
Gutachter:
Gutachter:
.forP
.forP
forP.
Andreas
.M.W
sserD
MichaelA.Steel(Christchurch,
ThomasZink
Datumdermu¨ndlichenPru¨fung:28.Januar2000
Neuseeland)
Gedrucktaufalterungsbesta¨ndigemPapier
1ISO9706.
Acknowledgments
ThisthesiswaswrittenunderthesupervisionofProf.AndreasW.M.Dress.
Partsofthisthesishavebeenpublishedinadvance[BD00b,BDS99],andfurther
papersareforthcoming[BBDS,BD,Bo¨ca,Bo¨cb].
Theworkwascarriedoutwhilebeingsupportedbythe“GraduiertenkollegStruk-
turbildungsprozesse”thatispartoftheCenterforInterdisciplinaryResearchonStruc-
tureFormation(FSPM)attheUniversityofBielefeld,Germany.
Firstandforemost,IwishtothankmysupervisorProfessorAndreasDress.
Withouthisideas,comments,andcontributionsthisthesiswouldnotexist.Ialso
thankhimforbringingmeintocontactwiththetopic,forfruitfuldiscussions,and
formaintainingmyenthusiasm.Hispromotionmadeitpossibleformetogoto
Christchurch,NewZealand,forthreemonths.
SpecialthanksgotoDr.MikeSteelwhoaskedtherightquestionattheright
timeandtherebygavemethechancetowritethisthesis.Heisnotonlymy“unocial
supervisor”butalsomyhikingtrainer,andhecoulddoubtlesslymakeathirdcareer
asacartographer.ThankyouverymuchforthehospitalityinChristchurch,andfor
themaps!
Thanks(withoutanyspecialorder)goto:MygirlfriendClaudiaHu¨ttnerfor
listeningandnotcomplaining;AnkeSchnitkerforthechats;myroommatesat
university,ChristianAlscherandSwenOsterkamp,forkeepingmesanemost
ofthetime;everybodyelseintheGraduiertenkolleg;myprimaryschoolteacher
FriedelKaraschewskifortellingmetostudymathematics;KordulaDo¨ringfor
theneedles;ThomasMu¨ller,NikolausWiesner,andthecityofHamburgfor
brainrelaxation;CakeforthreegreatCDs;ProfessorPeterSlodowyforshowing
methefunandbeautyofmathematics;theChristchurchiansforawonderful
time;BernhardBalkenhol,HolgerPaschke,andKarl-HermannWieners
forcomputersupport;MarcKupietzforcigarettes,movies,andmusic;myParents
foreverything.
Inaccordancewithmathematicsprotocol,Iwillendeavortousethepersonal
pronoun“we”toindicatethereaderandthewriter,ratherthanthe“royalwe.”
iii
i
v
ACKNOWLEDGM
STNE
Contents
Chapter1.Introduction
1.1.Theproblemoftreeamalgamation
1.2.Motivationviaapplications
1.3.Themainresult
1.4.Structureofthisthesis
1.5.Furthersteps
1.6.Basicnotations
Chapter2.Graphs,Trees,andPatches
2.1.Graphsandpaths
2.2.Trees
2.3.Patches
2.4.Twoequivalencerelationsonthesetofvertices
Chapter3.X-TreesandSplitSystems
3.1.Denitions
3.2.Basicobservations
3.3.InducedY-treesandsplitsystems
3.4.Inducingsubtreesviapatchesandsubsetsofedges
3.5.ConnectingpartialsplitsystemsandX-trees
Chapter4.TheTreeJoiningTheorem
4.1.Proofoftheexistence
4.2.Proofoftheuniqueness
Chapter5.ClusterSystemsandPatchworks
5.1.Denitions,notations,andbasicobservations
5.2.Mainresults
5.3.Examples
5.4.ProofofTheorem4
5.5.ProofofTheorems5and2incase#C<1
5.6.Applicationsofpatchworks
5.7.Droppingthenitenessrestriction
Chapter6.QuartetEncodingsandExcess
v
11223457781141717181023282131323536304144454640575
vi
viCONTENTS
6.1.Denitionsandbasicobservations
6.2.Examplesofquartetencodings
6.3.Tightquartetencodings
6.4.Inducedencodingsandq-patches
Chapter7.TheMainTheorem
7.1.Condition(ii)impliescondition(i)
7.2.Condition(i)impliescondition(ii)
7.3.Firstclaim
7.4.Secondclaim
7.5.Thirdclaim
Chapter8.ImplicationsandApplications
8.1.Reconstructingtreesfromquartetsplits
8.2.Reconstructingtreesfromsubtrees
8.3.Fastamalgamationofanexcess-freecollectionoftrees
8.4.Thedyadicclosure
Bibliography
ListofSymbols
xednI
758506265656669617277777385888195999
21.INTRODUCTION
Infact,eventheproblemofdeterminingwhetherthereexistsatleastonesupertreeof
TwasproventobeNP-completebySteel[Ste92];anditremainsNP-completeifwe
assume—withoutlossofgenerality,aswewillseeinSection3.5—thatalltreesinT
havepreciselyfourleaves.Thecomplexityofdeterminingwhethersome(known!)tree
istheuniquesupertreeofagivencollectionisstillunknown,butstronglyconjectured
tobeNP-completeaswell.AndifthetreesinTcanbeamalgamated,theremaybe
(exponentially)manysupertrees;seeExample19onpage81fora“non-trivial”case
whereallofthese(exponentiallymany)supertreesarebinary.
1.2.Motivationviaapplications
Invariouseldsofclassicationsuchasbiology[CSW87,SPH98,SOWH96],lin-
guistics[War97],orstemmatology[BBHR98,Rob96],treesareusedtorepresentevolu-
tionary,historical,orhierarchicalrelationships.Inbiology,suchtrees(“phylogenies”)
typicallyrepresenttheevolutionaryhistoryofacollectionofextantspecies,orthe
lineofdescentofsomegene.Inmostapplications,theobjectsofinterestarefound
atthetips(leaves)ofthetree,whileallinteriorverticescorrespondtoabranching
(orspeciation)event.
Thereareseveralreasonswhyamalgamationproblemsasdescribedabovearise
naturallyintheseelds.Firstly,wemaywishtocombinetreesthathavebeen
reconstructedusingdistinct,thoughoverlappingcollectionsofobjects(usuallyby
dierentresearchersusingdierentdataand,asoftenasnot,dierentreconstruc-
tionmethods).Asecondreasonisthat,ingeneral,itisdiculttoaccuratelyre-
constructlargetreesdirectly,andwemaychooseinsteadtoreconstruct“trustwor-
thy”treesforsmallsubsets,andthencombinetheseinasupertree(orsupertrees),
see[BD86,HNP+98,HVW99,HW99,SvH96,Wil98].Anadditionalreasonforthisap-
proachisthat,forgeneticdata,thenumberofsitesthatcanbeaccuratelyaligned
acrossasmallnumberofcloselyrelatedsequencesisoftenmuchlargerthanthecorre-
spondingnumberofsitesforasetthatislargeandincludesratherdiversesequences.
Also,theaccuracyofpolynomialtimetreereconstructionmethods(likeneighborjoin-
ingorheuristicsformaximumparsimony)appearstodeterioratewiththemaximum
evolutionarydistanceinaphylogeny[RW97,HNR+98].
1.3.Themainresult
Althoughtheproblemsof(a)ndingasupertreeand(b)testingwhetheritis
uniquelydeterminedareknowntobehard,afullanswercanbegiventothesec-
ondquestion(implyingaswellanecientwaytoeventuallyconstructthisunique
supertree)inthefollowing,particularlyinterestingcase:Givenacollectionofbinary
(ornon-degenerated)treesT={T1,T2,...,Tk},letXjdenotethesetofleaves(or
pendingvertices)ofthetreeTjforj=1,...,k,letXdenotethesetofleavesofa
binarytreeT,andassumethatXcoincideswiththeunionofthe(notnecessarily
disjoint)setsX1,...,Xk.IfTistheuniquesupertreeofthecollectionT,then
k#X3(#Xj3)
X1=j
1.4.STRUCTUREOFTHISTHESIS3
mustalwayshold,suggestingthatparticularattentionshouldbedevotedtothemini-
mum1case—thatis,thecasewhereequalityholds.Tothisend,wedenetheexcess
ofthecollectionTby
exc(T):=#Xj(#Xj3)3,
[kXk
j=1j=1
andwede