False discovery rate and asymptotics [electronic resource] / submitted by Thorsten-Ingo Dickhaus
144 pages
German



Information

Published on 1 January 2008

Excerpt

False Discovery Rate and Asymptotics
Inaugural dissertation
for the attainment of the doctoral degree of the
Faculty of Mathematics and Natural Sciences
of the Heinrich-Heine-Universität Düsseldorf
submitted by
Thorsten-Ingo Dickhaus
from Berlin-Kreuzberg
January 2008
From the Institute of Biometrics and Epidemiology of the
German Diabetes Center, Leibniz Institute at the
Heinrich-Heine-Universität Düsseldorf
Printed with the permission of the
Faculty of Mathematics and Natural Sciences of the
Heinrich-Heine-Universität Düsseldorf
Referee: Prof. Dr. Arnold Janssen
Co-referee: PD Dr. Helmut Finner
Date of the oral examination: 15 January 2008
Contents
Overview
1 Introduction
1.1 Multiple testing and False Discovery Rate
1.2 The concept of p-values
1.2.1 p-value adjustment for multiplicity
2 FDR control with Simes' critical values
2.1 General theoretical framework in the exchangeable setup
2.1.1 Two models with exchangeable test statistics
2.1.2 Largest crossing points and computation of EER and FDR
2.1.3 All LCPs greater than zero
2.1.4 Some LCPs equal to zero
2.2 Exchangeable exponentially distributed variables
2.3 Exchangeable normally distributed variables
2.3.1 The special case ζ = 1
2.3.2 The general case ζ < 1
2.4 Exchangeable studentized normal variables
2.4.1 The special case ν = 1
2.4.2 The general case ν > 1 and ζ < 1
2.5 Conclusions
3 A new rejection curve
3.1 Notation and preliminaries
3.2 Motivation and heuristic derivation
3.3 Procedures based on the new rejection curve
3.4 LFC results and upper FDR bounds
3.5 Asymptotic FDR control for procedures based on the AORC
3.6 Asymptotic optimality of the AORC
3.7 FDR control for a fixed number of hypotheses
3.7.1 Simultaneous β-adjustment
3.7.2 Multivariate optimization problem
3.8 Connection to Storey's approach
4 Power study for some FDR-controlling test procedures
4.1 Simple hypotheses case
4.2 Composite hypotheses case
4.3 Summary
5 Concluding remarks and outlook
A Numerical simulations and calculations
A.1 Simulations for FDR under dependency
A.2 Adjusted procedures based on the AORC
A.2.1 SUD-procedure, Example 3.5
A.2.2 SU-procedure based on $f^{(1)}_{0.05,\kappa_1}$, Example 3.7
A.2.3 SU-procedure based on $f^{(2)}_{0.05,\kappa_2}$, Example 3.7
A.2.4 SU-procedure with truncated curve, Example 3.8
B Concepts of positive dependency
List of Tables
List of Figures
Bibliography
List of Abbreviations and Symbols
AORC  Asymptotically Optimal Rejection Curve
B(p,q)  Beta function, B(p,q) = Γ(p)Γ(q)/Γ(p+q)
BP  Boundary Point
⌈x⌉  Smallest integer larger than or equal to x
$\chi^2_\nu$  Chi-square distribution with ν degrees of freedom
$\complement M$  Complement of the set M
CP  Crossing Point
cdf.  Cumulative distribution function
$\delta_{i,j}$  Kronecker symbol
ecdf.  Empirical cumulative distribution function
$\varepsilon_a$  Dirac measure in point a
$\stackrel{d}{=}$  Equality in distribution
EER  Expected Error Rate
$F_X$  Cumulative distribution function of a real-valued random variable X
FDR  False Discovery Rate
FWER  Family Wise Error Rate
⌊x⌋  Largest integer lower than or equal to x
Γ(·)  Gamma function, $\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt$, x > 0
im(X)  Image of the random entity X
i.i.d.  independent and identically distributed
$\mathbf{1}_M$  Indicator function of set M
LCP  Largest Crossing Point
L(X)  Law of distribution of random variable X
LFC  Least Favorable Configuration
λ  Lebesgue measure
MTP$_2$  Multivariate total positivity of order 2
$\mathbb{N}_n$  {1,...,n}
N(μ,σ²)  Normal distribution with parameters μ and σ²
Φ  Cumulative distribution function of the N(0,1) distribution
ϕ(·)  Probability density function of the N(0,1) distribution
PRD  Positive regression dependency
PRDS  Positive regression dependency on subsets
pdf.  Probability density function
pmf.  Probability mass function
SD  Step-down
SU  Step-up
SUD  Step-up-down
UNI[a,b]  Uniform distribution on the interval [a,b]
Overview
The False Discovery Rate (FDR) is a rather young paradigm for controlling the errors of a multiple
test procedure. Especially in the context of genetics and microarray analyses, the FDR has become
a very popular error control criterion over the last decade, because it is less restrictive than the
classical Family Wise Error Rate (FWER). This is especially important because in several of today's
application fields, such as genome-wide association (GWA) studies, tens of thousands or even
some hundreds of thousands of hypotheses have to be tested simultaneously. Moreover, the analyses
(at least at a first stage) are mainly explorative in character, so that at this stage one is often
more interested in obtaining some significant findings than in avoiding a few false ones. Instead of
controlling the probability of making at least one false rejection, the FDR controls the expected
proportion of falsely rejected (true) null hypotheses among all rejections. Due to the massive
multiplicity of some of the current applications, asymptotic considerations become more and more
relevant. Therefore, this work places special focus on the asymptotic behaviour of the
False Discovery Rate as the number n of hypotheses tends to infinity. Other applications
include astronomy (cf., e.g., [176]) and proteomics, cf. Application 2.4.
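To make the contrast between the two error criteria concrete, their usual definitions can be written as follows. The symbols V (number of falsely rejected true null hypotheses) and R (total number of rejections) are notation assumed here for illustration only; the formal definition used in this work is the one given in Chapter 1.

```latex
\mathrm{FWER} = \mathbb{P}(V \geq 1),
\qquad
\mathrm{FDR} = \mathbb{E}\!\left[\frac{V}{\max(R, 1)}\right].
```

Since $V/\max(R,1) \leq \mathbf{1}_{\{V \geq 1\}}$, it follows that FDR ≤ FWER, which is the sense in which the FDR is the less restrictive criterion.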
The remainder of this work is organized as follows. Chapter 1 presents some theoretical
foundations, including a formal definition of the FDR. Most of the results in that chapter
are already known, so it mainly serves as a review. Furthermore, some notational aspects are
covered.
Chapter 2 then deals with a popular FDR-controlling multiple test procedure, namely the linear
step-up procedure based on Simes' critical values, introduced in the pioneering article by
Benjamini and Hochberg from 1995, see [13]. Since it is well known that this method controls the
FDR when the test statistics at hand are positively dependent, we study its asymptotic
conservativeness in some special distributional situations.
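The linear step-up procedure mentioned above can be sketched in a few lines. This is a minimal illustration, assuming the standard form of Simes' critical values, i·α/n for the i-th smallest of n p-values; the function name `linear_step_up` and the example p-values are hypothetical and not taken from this work.

```python
def linear_step_up(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected."""
    n = len(p_values)
    # Indices of the p-values, sorted in ascending order of p-value.
    order = sorted(range(n), key=lambda i: p_values[i])
    # Step-up: find the LARGEST rank k with p_(k) <= k * alpha / n.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / n:
            k_max = rank
    # Reject the hypotheses belonging to the k_max smallest p-values.
    rejected = [False] * n
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values: the two smallest exceed their critical values
# (0.0125 and 0.025), but the third does not exceed 0.0375.
print(linear_step_up([0.02, 0.03, 0.035, 0.9], alpha=0.05))
# -> [True, True, True, False]
```

The example shows the step-up character: the three smallest p-values are all rejected because the third one falls below its critical value, even though the first two on their own do not.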
In Chapter 3 we present and investigate a new rejection curve designed to asymptotically exhaust
the whole FDR level α under some extreme parameter configurations.
Besides these theoretical considerations, we will apply some of the test procedures presented in
Chapters 2 and 3 to real-life data and investigate the FDR "at work".
Chapter 4 contains a systematic (numerical) comparison of some recently developed test
procedures which aim at improving the linear step-up procedure. Under various distributional
settings, we investigate their behaviour with respect to type I error and power. This allows us to
discuss the assets and drawbacks of each of the considered procedures.
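A numerical comparison of this kind can be illustrated by a small Monte Carlo experiment. The sketch below is not the simulation design of Chapter 4. It assumes independent one-sided Gaussian tests, and the function names, the parameters (`n`, `n_true`, `shift`, `reps`, `seed`) and the power definition (average fraction of false hypotheses rejected) are all hypothetical choices made for illustration. It estimates the FDR and power of the linear step-up procedure only.

```python
import random
from statistics import NormalDist


def linear_step_up(p_values, alpha):
    """Return the set of indices rejected by the linear step-up procedure."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / n:
            k_max = rank
    return set(order[:k_max])


def simulate(n=50, n_true=40, shift=3.0, alpha=0.05, reps=2000, seed=1):
    """Estimate (FDR, power) by simulation under an assumed Gaussian model."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    fdp_sum = 0.0
    power_sum = 0.0
    for _ in range(reps):
        # The first n_true hypotheses are true (mean 0); the rest are false (mean shift).
        z = [rng.gauss(0.0 if i < n_true else shift, 1.0) for i in range(n)]
        # One-sided p-values for testing mean <= 0.
        p = [1.0 - std_normal.cdf(x) for x in z]
        rejected = linear_step_up(p, alpha)
        false_rejections = sum(1 for i in rejected if i < n_true)
        true_rejections = len(rejected) - false_rejections
        # False discovery proportion, defined as 0 when nothing is rejected.
        fdp_sum += false_rejections / max(len(rejected), 1)
        power_sum += true_rejections / (n - n_true)
    return fdp_sum / reps, power_sum / reps


fdr_estimate, power_estimate = simulate()
print(f"estimated FDR: {fdr_estimate:.3f}, estimated power: {power_estimate:.3f}")
```

Under independence, the FDR of the linear step-up procedure is known to equal α times the proportion of true hypotheses, so with these assumed settings the estimate should land near 0.04, up to Monte Carlo error.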
Finally, in Chapter 5, our results are summarized and we give an outlook on some issues for
further research.
Some numerical computations and computer simulations referring to the theoretical results in
Chapters 2 and 3 are presented in the Appendix. Moreover, we briefly discuss some notions of
positive dependency there.
The research that has led to this work has been part of the first period of a research project
sponsored by the Deutsche Forschungsgemeinschaft (DFG), grant No. FI 524/3-1, under the
responsibility of my advisor Helmut Finner and of Prof. Guido Giani. In the application for this
grant, the aims of Chapters 2 and 3 have already been formulated, and parts of the elaborations in
these chapters are joint work with Helmut Finner and Markus Roters as well. Main results of Chapter
2 are pre-published in [86] and [88]. An article containing the main results of Chapter 3 has been
accepted for publication, see [87]. I am grateful to the DFG for financing my tenure at the German
Diabetes Center from July 2005 to April 2007 and to Helmut Finner for providing me with the
interesting topics and for some valuable preliminary notes from his treasure chest.
Chapter 1
Introduction
1.1 Multiple testing and False Discovery Rate
The goal of multiple testing consists of testing n > 1 hypotheses simultaneously and controlling
some kind of overall error rate. The most conservative and highly intuitive method is
the Family Wise Err
