Submitted to the Annals of Statistics arXiv: math PR
36 pages
English


Description

Level: Higher education, Doctorate (Bac+8)
Submitted to the Annals of Statistics. arXiv: math.PR/0000000. NEEDLES AND STRAWS IN A HAYSTACK: POSTERIOR CONCENTRATION FOR POSSIBLY SPARSE SEQUENCES. By Ismaël Castillo and Aad van der Vaart. We consider full Bayesian inference in the multivariate normal mean model in the situation that the mean vector is sparse. The prior distribution on the vector of means is constructed hierarchically by first choosing a collection of nonzero means and next a prior on the nonzero values. We consider the posterior distribution in the frequentist set-up that the observations are generated according to a fixed mean vector, and are interested in the posterior distribution of the number of nonzero components and the contraction of the posterior distribution to the true mean vector. We find various combinations of priors on the number of nonzero coefficients and on these coefficients that give desirable performance. We also find priors that give suboptimal convergence, for instance Gaussian priors on the nonzero coefficients. We illustrate the results by simulations. 1. Introduction. Suppose that we observe a vector $X = (X_1, \ldots, X_n)$ in $\mathbb{R}^n$ such that (1.1) $X_i = \theta_i + \varepsilon_i$, $i = 1, \ldots, n$, for independent standard normal random variables $\varepsilon_i$ and an unknown vector of means $\theta = (\theta_1, \ldots, \theta_n)$. We are interested in Bayesian inference on $\theta$, in the situation that this vector is possibly sparse.




Excerpt

Submitted to the Annals of Statistics
arXiv: math.PR/0000000

NEEDLES AND STRAWS IN A HAYSTACK: POSTERIOR CONCENTRATION FOR POSSIBLY SPARSE SEQUENCES

By Ismaël Castillo∗ and Aad van der Vaart

We consider full Bayesian inference in the multivariate normal mean model in the situation that the mean vector is sparse. The prior distribution on the vector of means is constructed hierarchically by first choosing a collection of nonzero means and next a prior on the nonzero values. We consider the posterior distribution in the frequentist set-up that the observations are generated according to a fixed mean vector, and are interested in the posterior distribution of the number of nonzero components and the contraction of the posterior distribution to the true mean vector. We find various combinations of priors on the number of nonzero coefficients and on these coefficients that give desirable performance. We also find priors that give suboptimal convergence, for instance Gaussian priors on the nonzero coefficients. We illustrate the results by simulations.

1. Introduction. Suppose that we observe a vector $X = (X_1, \ldots, X_n)$ in $\mathbb{R}^n$ such that

\[ X_i = \theta_i + \varepsilon_i, \qquad i = 1, \ldots, n, \tag{1.1} \]

for independent standard normal random variables $\varepsilon_i$ and an unknown vector of means $\theta = (\theta_1, \ldots, \theta_n)$. We are interested in Bayesian inference on $\theta$, in the situation that this vector is possibly sparse.
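As a minimal sketch of what data from model (1.1) look like, the following Python snippet simulates one observation vector. The function name, the common value given to all nonzero means, and the uniform choice of their positions are illustrative assumptions and do not come from the paper.

```python
import random


def simulate_sparse_means(n, p, signal, seed=0):
    """Simulate X_i = theta_i + eps_i with eps_i ~ N(0, 1).

    Exactly p of the n means are set to `signal`; the rest are zero.
    The common signal value and the uniform choice of support are
    illustrative assumptions, not part of model (1.1).
    """
    rng = random.Random(seed)
    support = sorted(rng.sample(range(n), p))  # indices of the nonzero means
    theta = [0.0] * n
    for i in support:
        theta[i] = signal
    x = [t + rng.gauss(0.0, 1.0) for t in theta]  # add standard normal noise
    return theta, x, support


theta, x, support = simulate_sparse_means(n=500, p=10, signal=5.0)
print(len(x), len(support))
```

Only `x` would be available to the statistician; `theta` and `support` are returned so that an estimate can be compared with the truth.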
Non-Bayesian approaches to this problem have recently been considered by many authors. Golubev [13] obtained results for model selection methods and threshold estimators for the mean-squared risk. Birgé and Massart [4] treated the model within their general context of model selection by penalized least squares. Abramovich et al. in [1] studied the performance of the False Discovery Rate method. The earlier work by Donoho and Johnstone [10] can be viewed as studying the problem within an $\ell_r$ context. Many authors (see e.g. [3], [22], [21] and references cited there) have investigated the connection to the LASSO or similar methods.

Methods with a Bayesian connection were studied by George and Foster [12], Zhang [20], Johnstone and Silverman [16, 17], Abramovich, Grinshtein and Pensky [2], and Jiang and Zhang [15]. The papers [12] and [16] considered an empirical Bayes method, consisting of modelling the parameters $\theta_1, \ldots, \theta_n$ a priori as independently drawn from a mixture of a Dirac measure at 0 and a continuous distribution, determining an appropriate mixing weight by the method of (restricted) marginal maximum likelihood, and finally employing the posterior median or mean. The second paper [2] motivated penalties, applied in a penalized minimum contrast scheme, by prior distributions on the parameters, and derived estimators for the number of nonzero $\theta_i$ and the $\theta_i$ itself. The first is a posterior mode, but the estimator for $\theta$, called "Bayesian testimation", does not seem itself Bayesian. (In fact, the Gaussian prior for the non-zero parameters in [2] will be seen to perform suboptimally in our fully Bayesian set-up.) The papers [20] and [15] obtain sharp results on (nonparametric) empirical Bayes estimators.

Other related papers include [19], [6], [7], [14], [15], [5].

∗ Work partly supported by a Postdoctoral fellowship from the VU University Amsterdam.
AMS 2000 subject classifications: Primary 62G05, 62G20.
Keywords and phrases: Bayesian estimators, Sparsity, Gaussian sequence model, Mixture priors, Asymptotics, Contraction.
imsart-aos ver. 2007/12/10 file: spa-revised.tex date: November 8, 2011
A penalized minimum contrast estimator can often be viewed as the mode of the posterior distribution, and it is helpful to interpret penalties accordingly. However, the Bayesian approach yields a full posterior distribution, which is a random probability distribution on the parameter space. It has both a location and a spread, and can be marginalized to give posterior distributions for any functions of the parameter vector of interest. It is this object that we study in this paper. Such full Bayesian inference was recently considered by Scott and Berger [18], who discussed various aspects not covered in the present paper, but no concentration results. One example of our results is that the beta-binomial priors in [18], combined with moderately to heavy tailed priors on the nonzero means, yield optimal recovery.
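Sparsity can be defined in various ways. Perhaps the most natural definition is the class of nearly black vectors, defined as

\[ \ell_0[p_n] = \bigl\{ \theta \in \mathbb{R}^n : \#(1 \le i \le n : \theta_i \ne 0) \le p_n \bigr\}. \]

Here $p_n$ is a given number, which in theoretical investigations is typically assumed to be $o(n)$, as $n \to \infty$. Sparsity may also mean that many means are small, but possibly not exactly zero. Definitions that make this precise use strong or weak $\ell_s$-balls, typically for $s \in (0, 2)$. These are defined as, with $\theta_{[1]} \ge \theta_{[2]} \ge \cdots \ge \theta_{[n]}$ the nonincreasing permutation of the coordinates of $\theta = (\theta_1, \ldots, \theta_n)$,

\[ \ell_s[p_n] = \Bigl\{ \theta \in \mathbb{R}^n : \frac{1}{n} \sum_{i=1}^n |\theta_i|^s \le \Bigl( \frac{p_n}{n} \Bigr)^s \Bigr\}, \]

\[ m_s[p_n] = \Bigl\{ \theta \in \mathbb{R}^n : \frac{1}{n} \max_{1 \le i \le n} i\, |\theta_{[i]}|^s \le \Bigl( \frac{p_n}{n} \Bigr)^s \Bigr\}. \]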
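The three sparsity classes can be made concrete with a short membership check. This is a sketch under stated assumptions: the function names are hypothetical, and for the weak ball the coordinates are ordered by decreasing absolute value, which is the usual convention but is an assumption about the intended ordering.

```python
def in_nearly_black(theta, p_n):
    """theta is in l_0[p_n] if it has at most p_n nonzero coordinates."""
    return sum(1 for t in theta if t != 0) <= p_n


def in_strong_ball(theta, p_n, s):
    """theta is in l_s[p_n] if (1/n) * sum |theta_i|^s <= (p_n/n)^s."""
    n = len(theta)
    return sum(abs(t) ** s for t in theta) / n <= (p_n / n) ** s


def in_weak_ball(theta, p_n, s):
    """theta is in m_s[p_n] if (1/n) * max_i i*|theta_[i]|^s <= (p_n/n)^s.

    Assumption: theta_[i] is the i-th largest coordinate in absolute value.
    """
    n = len(theta)
    ordered = sorted((abs(t) for t in theta), reverse=True)
    worst = max(i * v ** s for i, v in enumerate(ordered, start=1))
    return worst / n <= (p_n / n) ** s


spike = [10.0, 0.0, 0.0, 0.0]          # one large coordinate
dense = [0.1, 0.1, 0.1, 0.1]           # many small coordinates
harmonic = [1.0, 1 / 2, 1 / 3, 1 / 4]  # slowly decaying coordinates

for name, theta, p_n in [("spike", spike, 1), ("dense", dense, 1), ("harmonic", harmonic, 2)]:
    print(name, in_nearly_black(theta, p_n), in_strong_ball(theta, p_n, 1.0), in_weak_ball(theta, p_n, 1.0))
```

With $s = 1$, the vector `spike` is nearly black but lies in neither ball, `dense` lies in both balls but is not nearly black, and `harmonic` lies in the weak ball but not in the strong one.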
Because the nonzero coefficients in $\ell_0[p_n]$ are not quantitatively restricted, there is no inclusion relationship between this space and the weak and strong balls, although results for the latter can be obtained by projecting them into $\ell_0[p_n]$. On the other hand, for any $s > 0$ we have the inclusion $\ell_s[p_n] \subset m_s[p_n]$.
The extent of the sparsity, measured by the constant $p_n$, is assumed unknown. Our Bayesian approach starts by putting a prior $\pi_n$ on this number, a given probability measure on the set $\{0, 1, 2, \ldots, n\}$. Next we complete this to a prior on the set of all possible sequences $\theta = (\theta_1, \ldots, \theta_n)$ in $\mathbb{R}^n$ as follows: given a draw $p$ from $\pi_n$, we choose a random subset $S \subset \{1, \ldots, n\}$ of cardinality $p$, draw the corresponding coordinates $(\theta_i : i \in S)$ from a density $g_S$ on $\mathbb{R}^S$, and set the remaining coordinates $(\theta_i : i \in S^c)$ equal to zero. Given this prior, Bayes' rule yields the posterior distribution of $\theta$ as usual. We investigate the properties of this posterior distribution, in its dependence on the priors on the dimension and on the nonzero coefficients, in the non-Bayesian set-up where $X$ follows (1.1) with $\theta$ equal to a fixed, "true" parameter $\theta_0$.
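The hierarchical construction of the prior can be sketched as a three-step sampler: draw the dimension, draw the support, draw the nonzero values. The specific choices below are assumptions made only for illustration: the dimension weights proportional to $e^{-p}$, a support drawn uniformly among subsets of the given size, and independent standard Laplace values for the nonzero coordinates (a product form for $g_S$). All function names are hypothetical.

```python
import math
import random


def sample_prior(n, dim_weights, draw_nonzero, rng):
    """Draw theta from the hierarchical prior.

    dim_weights[p] is proportional to pi_n{p} for p = 0, ..., n.
    draw_nonzero(rng) returns one nonzero coordinate (assumed i.i.d. here).
    """
    # Step 1: draw the number of nonzero coordinates from pi_n.
    p = rng.choices(range(n + 1), weights=dim_weights, k=1)[0]
    # Step 2: draw a subset S of size p (assumed uniform over such subsets).
    support = sorted(rng.sample(range(n), p))
    # Step 3: draw the coordinates in S; all others stay at zero.
    theta = [0.0] * n
    for i in support:
        theta[i] = draw_nonzero(rng)
    return theta, support


def laplace(rng):
    """Standard Laplace draw: random sign times an exponential variable."""
    return rng.choice((-1.0, 1.0)) * rng.expovariate(1.0)


rng = random.Random(1)
n = 50
weights = [math.exp(-p) for p in range(n + 1)]  # illustrative choice of pi_n
theta, support = sample_prior(n, weights, laplace, rng)
print(len(support), support)
```

Passing the dimension weights and the coordinate sampler as arguments keeps the two ingredients of the prior separate, which mirrors how the text studies their effects separately.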
If the true parameter vector $\theta_0$ belongs to $\ell_0[p_n]$, then it is desirable that the posterior distribution concentrates most of its mass on nearly black vectors. One main result of the paper is that this is the case provided the prior probabilities $\pi_n\{p\}$ decrease exponentially fast with the dimension $p$.
The quality of the reconstruction of the full vector $\theta$ can be measured by various distances. A natural one is the Euclidean distance, with square

\[ \|\theta - \theta_0\|^2 = \sum_{i=1}^n (\theta_i - \theta_{0,i})^2. \]

If the indices of the $p_n$ nonzero coordinates of a vector in the model $\ell_0[p_n]$ were known a priori, then the vector could be estimated with mean square error of the order $p_n$. In [11] it is shown that, as $n, p_n \to \infty$ with $p_n = o(n)$,

\[ \inf_{\hat\theta} \sup_{\theta \in \ell_0[p_n]} P_{n,\theta} \|\hat\theta - \theta\|^2 = 2 p_n \log(n/p_n) \bigl( 1 + o(1) \bigr). \]
Here the infimum is taken over all estimators $\hat\theta = \hat\theta(X)$ and $P_{n,\theta}$ denotes taking the expectation under the assumption that $X$ is $N_n(\theta, I)$-distributed. In other words, the square minimax rate over $\ell_0[p_n]$ is $p_n \log(n/p_n)$, meaning that not knowing the identity of the nonzero means costs only a logarithmic factor.
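To get a feel for the size of this logarithmic factor, the leading term $2 p_n \log(n/p_n)$ can be evaluated for a few values of $n$ and $p_n$ and compared with the order-$p_n$ error available when the support is known. The snippet below is only an arithmetic illustration: it ignores the $1 + o(1)$ factor, and the function name and the chosen values of $n$ and $p_n$ are assumptions.

```python
import math


def minimax_leading_term(n, p_n):
    """Leading term 2 * p_n * log(n / p_n) of the squared minimax risk."""
    if not 0 < p_n < n:
        raise ValueError("need 0 < p_n < n")
    return 2.0 * p_n * math.log(n / p_n)


for n, p_n in [(1_000, 10), (1_000_000, 100)]:
    rate = minimax_leading_term(n, p_n)
    # With known support the risk is of order p_n, so rate / p_n,
    # which equals 2 * log(n / p_n), is the price of not knowing it.
    print(n, p_n, round(rate, 1), round(rate / p_n, 1))
```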
The Bayesian approach is presumably adopted for the intuition provided by prior modelling, and is not necessarily directed at attaining minimax

