GENERALIZATION OF AN INEQUALITY BY TALAGRAND AND LINKS WITH THE LOGARITHMIC SOBOLEV INEQUALITY
37 pages
English

GENERALIZATION OF AN INEQUALITY BY TALAGRAND, AND LINKS WITH THE LOGARITHMIC SOBOLEV INEQUALITY
F. OTTO AND C. VILLANI
Abstract. We show that transport inequalities, similar to the one derived by Talagrand [30] for the Gaussian measure, are implied by logarithmic Sobolev inequalities. Conversely, Talagrand's inequality implies a logarithmic Sobolev inequality if the density of the measure is approximately log-concave, in a precise sense. All constants are independent of the dimension, and optimal in certain cases. The proofs are based on partial differential equations, and an interpolation inequality involving the Wasserstein distance, the entropy functional and the Fisher information.
Contents
1. Introduction
2. Main results
3. Heuristics
4. Proof of Theorem 1
5. Proof of Theorem 3
6. An application of Theorem 1
7. Linearizations
Appendix A. A nonlinear approximation argument
References
1. Introduction
Let M be a smooth complete Riemannian manifold of dimension n,
with the geodesic distance
(1) \[ d(x,y) = \inf\left\{ \sqrt{\int_0^1 |\dot w(t)|^2\,dt}\ ;\ w \in C^1((0,1);M),\ w(0)=x,\ w(1)=y \right\}. \]
We define the Wasserstein distance, or transportation distance with quadratic cost, between two probability measures µ and ν on M, by
(2) \[ W(\mu,\nu) = \sqrt{T_2(\mu,\nu)} = \inf_{\pi \in \Pi(\mu,\nu)} \sqrt{\int_{M\times M} d(x,y)^2 \, d\pi(x,y)}, \]
where Π(µ,ν) denotes the set of probability measures on M×M with marginals µ and ν, i.e. such that for all bounded continuous functions f and g on M,
\[ \int_{M\times M} d\pi(x,y)\,\bigl[f(x)+g(y)\bigr] = \int_M f\,d\mu + \int_M g\,d\nu. \]
Equivalently,
\[ W(\mu,\nu) = \inf\left\{ \sqrt{\mathbb{E}\,d(X,Y)^2}\ ;\ \mathrm{law}(X)=\mu,\ \mathrm{law}(Y)=\nu \right\}, \]
where the infimum is taken over arbitrary random variables X and Y on M. This infimum is finite as soon as µ and ν have finite second moments, which we shall always assume.
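As a quick numerical aside (not part of the paper), the infimum over couplings is attained in dimension one by the monotone rearrangement, so W between two equal-weight empirical measures on the line can be computed exactly by sorting; a minimal sketch:

```python
import math

def wasserstein2_1d(xs, ys):
    """W between two equal-size empirical measures on the real line.

    In one dimension the optimal coupling in (2) pairs the sorted
    samples (monotone rearrangement), so the infimum is explicit:
    W^2 = (1/n) * sum of (x_(i) - y_(i))^2 over sorted samples.
    """
    assert len(xs) == len(ys)
    xs, ys = sorted(xs), sorted(ys)
    n = len(xs)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / n)

# Translating the two-point measure on {0, 1} by 2 gives W = 2,
# the translation distance, as expected for a quadratic cost.
print(wasserstein2_1d([0.0, 1.0], [2.0, 3.0]))  # -> 2.0
```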
The Wasserstein distance has a long history in probability theory and statistics, as a natural way to measure the distance between two probability measures in a weak sense. As a matter of fact, W metrizes the weak-* topology on P_2(M), the set of probability measures on M with finite second moments. More precisely, if (µ_n) is a sequence of probability measures on M such that for some (and thus any) x_0 ∈ M,
\[ \lim_{R\to\infty}\ \sup_n \int_{d(x_0,x)\ge R} d(x_0,x)^2 \, d\mu_n(x) = 0, \]
then W(µ_n, µ) → 0 if and only if µ_n → µ in weak measure sense.
Striking applications of the use of this and related metrics were recently put forward in works by Marton [21] and Talagrand [30]. There, Talagrand shows how to obtain rather sharp concentration estimates in a Gaussian setting, with a completely elementary method, which runs as follows. Let
\[ d\gamma(x) = \frac{e^{-|x|^2/2}}{(2\pi)^{n/2}} \, dx \]
denote the standard Gaussian measure. Talagrand proved that for any probability measure µ on R^n, with density h = dµ/dγ with respect to γ,
(3) \[ W(\mu,\gamma) \le \sqrt{2\int_{\mathbb{R}^n} h\log h \, d\gamma} = \sqrt{2\int_{\mathbb{R}^n} \log h \, d\mu}. \]
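As an illustration (our addition, not from the paper), inequality (3) can be checked against the standard closed forms for one-dimensional Gaussians: for µ = N(m, s²) one has W(µ,γ)² = m² + (s−1)² and H(µ|γ) = (m² + s² − 1 − log s²)/2, with equality in (3) for pure translations:

```python
import math

def w2_gauss(m, s):
    # Closed form for W between N(m, s^2) and the standard Gaussian N(0, 1).
    return math.sqrt(m ** 2 + (s - 1.0) ** 2)

def entropy_gauss(m, s):
    # Relative entropy H(N(m, s^2) | N(0, 1)) = (m^2 + s^2 - 1 - log s^2)/2.
    return 0.5 * (m ** 2 + s ** 2 - 1.0 - math.log(s ** 2))

# Inequality (3): W(mu, gamma) <= sqrt(2 H(mu | gamma)); equality when s = 1.
for m, s in [(0.0, 2.0), (1.5, 0.5), (3.0, 1.0), (-2.0, 4.0)]:
    assert w2_gauss(m, s) <= math.sqrt(2.0 * entropy_gauss(m, s)) + 1e-12
```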
Now, let B ⊂ R^n be a measurable set with positive measure γ(B), and for any t > 0 let
\[ B_t = \{ x \in \mathbb{R}^n \ ;\ d(x,B) \le t \}. \]
Here d(x,B) = inf_{y∈B} |x − y|. Moreover, let γ|_B denote the restriction of γ to B, i.e. the measure (1_B/γ(B)) dγ. A straightforward computation, using (3) and the triangle inequality for W, yields the estimate
\[ W\bigl(\gamma|_B,\ \gamma|_{\complement B_t}\bigr) \le \sqrt{2\log\frac{1}{\gamma(B)}} + \sqrt{2\log\frac{1}{1-\gamma(B_t)}}. \]
Since, obviously, this distance is bounded below by t, this entails
(4) \[ \gamma(B_t) \ge 1 - e^{-\frac{1}{2}\left( t - \sqrt{2\log\frac{1}{\gamma(B)}}\right)^2}. \]
In words, the measure of B_t goes rapidly to 1 as t grows: this is a standard result in the theory of the concentration of measure in Gauss space, which can also be derived from Gaussian isoperimetry.
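A small sanity check of estimate (4), added here for illustration: in dimension n = 1 with B = (−∞, 0], we have γ(B) = 1/2 and γ(B_t) = Φ(t), the standard normal CDF, which can be compared with the bound directly:

```python
import math

def Phi(t):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

# With B = (-inf, 0] in dimension 1: gamma(B) = 1/2, B_t = (-inf, t],
# so estimate (4) reads  Phi(t) >= 1 - exp(-(t - t0)^2 / 2),
# where t0 = sqrt(2 log 2).
t0 = math.sqrt(2.0 * math.log(2.0))
for k in range(100):
    t = t0 + 0.1 * k
    bound = 1.0 - math.exp(-0.5 * (t - t0) ** 2)
    assert Phi(t) >= bound
```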
Talagrand's proof of (3) is completely elementary; after establishing it in dimension 1, he proceeds by induction on the dimension, taking advantage of the tensorization properties of both the Gaussian measure and the entropy functional E(h log h). His proof is robust enough to yield a comparable result in the more delicate case of a tensor product of exponential measures, e^{-|x_1|-\cdots-|x_n|} dx_1 \cdots dx_n, with a complicated variant of the Wasserstein metric. Bobkov and Götze also recovered inequality (3) as a consequence of the Prékopa–Leindler inequality, and an argument due to Maurey [22].
In this paper, we shall give a new proof of inequality (3), and generalize it to a very wide class of probability measures: namely, all probability measures ν (on a Riemannian manifold M) satisfying a logarithmic Sobolev inequality, which means
(5) \[ \int_M h\log h \, d\nu - \left(\int_M h \, d\nu\right) \log\left(\int_M h \, d\nu\right) \le \frac{1}{2\rho} \int_M \frac{|\nabla h|^2}{h} \, d\nu, \]
holding for all (reasonably smooth) functions h on M, with some fixed ρ > 0. Let us recall that (5) is obviously equivalent, at least for smooth h, to the (maybe) more familiar form
\[ \int_M g^2 \log g^2 \, d\nu - \left(\int_M g^2 \, d\nu\right) \log\left(\int_M g^2 \, d\nu\right) \le \frac{2}{\rho} \int_M |\nabla g|^2 \, d\nu. \]
In the case M = R^n, ν = γ, ρ = 1, this is Gross's logarithmic Sobolev inequality, and we shall prove that it implies Talagrand's inequality (3).
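The equivalence between the two forms of (5) is the standard substitution h = g² (for smooth, positive g):

```latex
\[
h = g^2 \;\Longrightarrow\; \nabla h = 2g\,\nabla g
\;\Longrightarrow\; \frac{|\nabla h|^2}{h} = \frac{4g^2|\nabla g|^2}{g^2} = 4\,|\nabla g|^2,
\]
so that
\[
\frac{1}{2\rho}\int_M \frac{|\nabla h|^2}{h}\,d\nu
= \frac{2}{\rho}\int_M |\nabla g|^2\,d\nu .
\]
```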
As we realized after this study, this implication was conjectured by Bobkov and Götze in their recent work [5]. But we wish to emphasize the generality of our result: in fact we shall prove that (5) implies an inequality similar to (3), only with the coefficient 2 replaced by 2/ρ. This result is in general optimal, as the example of the Gaussian measure shows. By known results on logarithmic Sobolev inequalities, it also entails immediately that inequalities similar to (3) hold for (not necessarily product) measures e^{-\Psi(x)-\psi(x)} dx on R^n (resp. on a manifold M) such that ψ is bounded and the Hessian D²Ψ is uniformly positive definite (resp. D²Ψ + Ric, where Ric stands for the Ricci curvature tensor on M).
This implication fits very well in the general picture of applications of logarithmic Sobolev inequalities to the concentration of measure, as developed for instance in [19].
Then, a natural question is the converse statement: does an inequality such as (3) imply (5)? The answer is known to be positive for measures on R^n that are log-concave, or approximately so: this was shown by Wang, using exponential integrability bounds. But we shall present a completely different proof, based on an information-theoretic interpolation inequality, which is apparently new and whose range of applications is certainly very broad. It was used by the first author in [26] for the study of the long-time behaviour of some nonlinear PDE's. One interest of this proof is to provide bounds which are dimension-free, and in fact optimal in certain regimes, thus qualitatively much better than those already known.
Our arguments are mainly based on partial differential equations. This point of view was already successfully used by Bakry and Emery [3] to derive simple sufficient conditions for logarithmic Sobolev inequalities (see also the recent exhaustive study by Arnold et al. [1]), and will appear very powerful here too; in fact, our proofs also imply the main results in [3].
Note added in proof: After our main results were announced, S. Bobkov and M. Ledoux gave alternative proofs of Theorem 1 below, based on an argument involving the Hamilton-Jacobi equation.
Acknowledgement: The second author thanks A. Arnold, F. Barthe and A. Swiech for discussions on related topics, and especially M. Ledoux for providing his lecture notes [19] (which motivated this work), as well as for discussing the questions addressed here. Both authors gratefully acknowledge stimulating discussions with Y. Brenier and W. Gangbo. Part of this work was done when the second author was visiting the University of Santa Barbara, and part of it when he was in the University of Pavia; the main results were first announced in November, 1998, on the occasion of a seminar in Georgia Tech. It is a pleasure to thank all of these institutions for their kind hospitality. The first author also acknowledges support from the National Science Foundation and the A. P. Sloan Research Foundation.
2. Main results
We shall always deal with probability measures that are absolutely continuous w.r.t. the standard volume measure dx on the (smooth, complete) manifold M, and sometimes identify them with their density. We shall fix a "reference" probability measure dν = e^{-\Psi(x)} dx, and assume enough smoothness on Ψ: say, Ψ is twice differentiable. As far as we know, the most important cases of interest are (a) M = R^n, (b) M has finite volume, normalized to unity, and dν = dx (so Ψ = 0). An interesting limit case of (a) is dν = dx|_B, where B is a closed smooth subset of R^n. Depending on the cases of study, many extensions are possible by approximation arguments.
Let dµ = f dx; we define its relative entropy with respect to dν = e^{-\Psi} dx by
(6) \[ H(\mu|\nu) = \int_M \log\frac{d\mu}{d\nu} \, d\mu = \int_M \frac{d\mu}{d\nu} \log\frac{d\mu}{d\nu} \, d\nu, \]
or equivalently by
(7) \[ H(f|e^{-\Psi}) = \int_M f\,(\log f + \Psi) \, dx. \]
Next, we define the relative Fisher information by
(8) \[ I(\mu|\nu) = \int_M \left|\nabla\log\frac{d\mu}{d\nu}\right|^2 d\mu = 4\int_M \left|\nabla\sqrt{\frac{d\mu}{d\nu}}\right|^2 d\nu, \]
or equivalently by
(9) \[ I(f|e^{-\Psi}) = \int_M f\,|\nabla(\log f + \Psi)|^2 \, dx. \]
Here |·|² denotes the square norm in the Riemannian structure on M, and ∇ is the gradient on M. The relative Fisher information is well-defined in [0, +∞] by the expression in the right-hand side of (8). The relative entropy is also well-defined in [0, +∞], for instance by the expression
\[ \int_M \left( \frac{d\mu}{d\nu}\log\frac{d\mu}{d\nu} - \frac{d\mu}{d\nu} + 1 \right) d\nu, \]
which is the integral of a nonnegative function.
Definition 1. The probability measure ν satisfies a logarithmic Sobolev inequality with constant ρ > 0 (in short: LSI(ρ)) if for all probability measures µ absolutely continuous w.r.t. ν,
(10) \[ H(\mu|\nu) \le \frac{1}{2\rho}\, I(\mu|\nu). \]
This definition is equivalent to (5), since here we restrict to measures µ = hν which are probability measures.
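As an illustration of Definition 1 (added here, not in the original), Gross's inequality LSI(1) for the standard Gaussian can be verified on the family N(m, s²), for which both H and I of (6) and (8) have elementary closed forms in dimension 1:

```python
import math

def entropy_gauss(m, s):
    # H(N(m, s^2) | N(0, 1)) computed from (6): (m^2 + s^2 - 1 - log s^2)/2.
    return 0.5 * (m ** 2 + s ** 2 - 1.0 - math.log(s ** 2))

def fisher_gauss(m, s):
    # I(N(m, s^2) | N(0, 1)): log(dmu/dnu) is quadratic, so (8) reduces to
    # m^2 + (s - 1/s)^2.
    return m ** 2 + (s - 1.0 / s) ** 2

# Definition 1 with rho = 1 (Gross): H(mu | gamma) <= I(mu | gamma) / 2.
for m, s in [(0.0, 3.0), (2.0, 0.4), (1.0, 1.0), (-1.0, 2.5)]:
    assert entropy_gauss(m, s) <= 0.5 * fisher_gauss(m, s) + 1e-12
```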
Definition 2. The probability measure ν satisfies a Talagrand inequality with constant ρ > 0 (in short: T(ρ)) if for all probability measures µ, absolutely continuous w.r.t. ν, with finite moments of order 2,
(11) \[ W(\mu,\nu) \le \sqrt{\frac{2H(\mu|\nu)}{\rho}}. \]
By combining (10) and (11), we also naturally introduce the
Definition 3. The probability measure ν satisfies LSI+T(ρ) if for all probability measures µ, absolutely continuous w.r.t. ν and with finite moments of order 2,
(12) \[ W(\mu,\nu) \le \frac{1}{\rho}\sqrt{I(\mu|\nu)}. \]
Our first result states that (10) is stronger than (11), and thus than (12) as well. Below, I_n denotes the identity matrix of order n, and Ric stands for the Ricci curvature tensor on M.
Theorem 1. Let dν = e^{-\Psi} dx be a probability measure with finite moments of order 2, such that Ψ ∈ C²(M) and D²Ψ + Ric ≥ −C I_n, C ∈ R. If ν satisfies LSI(ρ) for some ρ > 0, then it also satisfies T(ρ), and (obviously) LSI+T(ρ).
Remark. The assumption on D²Ψ + Ric is there only to avoid pathological situations, and to ensure uniform bounds on the solution of a related PDE. For all the cases of interest known to the authors, it is not a restriction. The value of C plays no role in the results.
We now recall a simple criterion for ν to satisfy a logarithmic Sobolev inequality. This is the celebrated result of Bakry and Emery.
Theorem 2 (Bakry and Emery [3]). Let dν = e^{-\Psi} dx be a probability measure on M, such that Ψ ∈ C²(M) and D²Ψ + Ric ≥ ρ I_n, ρ > 0. Then dν satisfies LSI(ρ).
As is well-known since the work of Holley and Stroock [16], if ν satisfies LSI(ρ), and ν̃ = e^{-\psi} ν is a "bounded perturbation" of ν (this means ψ ∈ L^∞, and ν̃ is a probability measure), then ν̃ satisfies LSI(ρ̃) with ρ̃ = ρ e^{-osc(ψ)}, osc(ψ) = sup ψ − inf ψ. This simple lemma allows one to extend considerably the range of probability measures which are known to satisfy a logarithmic Sobolev inequality.
Next, we are interested in the converse implication to Theorem 1. It will turn out that it is actually a corollary of a general "interpolation" inequality between the functionals H, W and I.
Theorem 3. Let dν = e^{-\Psi} dx be a probability measure on R^n, with finite moments of order 2, such that Ψ ∈ C²(R^n), D²Ψ ≥ K I_n, K ∈ R (not necessarily positive). Then, for all probability measures µ on R^n, absolutely continuous w.r.t. ν, the following "HWI inequality" holds:
(13) \[ H(\mu|\nu) \le W(\mu,\nu)\sqrt{I(\mu|\nu)} - \frac{K}{2}\, W(\mu,\nu)^2. \]
Remarks.
(1) In particular, if Ψ is convex, then
(14) \[ H(\mu|\nu) \le W(\mu,\nu)\sqrt{I(\mu|\nu)}. \]
(2) Formally, it is not difficult to adapt our proof to a general Riemannian setting, with D²Ψ replaced by D²Ψ + Ric in the assumptions of Theorem 3 (and of its corollaries stated below). However, a rigorous proof requires some preliminary work on the Wasserstein distance on manifolds, which is of independent interest and will therefore be examined elsewhere. At the moment, using the results of [12], we can only obtain the same results when dν = e^{-\Psi} dx is a probability measure on the torus T^n, where Ψ is the restriction to T^n of a function Ψ̃ on R^n with D²Ψ̃ ≥ K I_n.
(3) By Young's inequality, if K > 0, inequality (13) implies LSI(K). Thus, this inequality contains the Bakry-Emery result (at least in R^n). Moreover, the cases of equality for (13) are the same as for LSI(K). By the way, this shows that the constant 1 (in front of the right-hand side of (13)) is optimal for K > 0.
(4) In any case, we have, for any ρ > 0,
\[ H(\mu|\nu) \le \frac{1}{2\rho}\, I(\mu|\nu) + \frac{\rho - K}{2}\, W(\mu,\nu)^2. \]
This tells us that LSI(ρ) is always satisfied (for any ρ), up to a "small error", i.e. an error term of second order in the weak topology.
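The HWI inequality (13) can also be tested numerically (our illustration, not part of the text): take ν = γ in dimension 1, so Ψ(x) = x²/2 + const and K = 1, and use the closed forms of H, W, I on the Gaussian family N(m, s²). Translations (s = 1) give equality:

```python
import math

# Closed forms for mu = N(m, s^2) against nu = gamma = N(0, 1), where
# Psi is quadratic and K = 1 in Theorem 3.
def H(m, s): return 0.5 * (m ** 2 + s ** 2 - 1.0 - math.log(s ** 2))
def W(m, s): return math.sqrt(m ** 2 + (s - 1.0) ** 2)
def I(m, s): return m ** 2 + (s - 1.0 / s) ** 2

# HWI inequality (13) with K = 1:  H <= W * sqrt(I) - W^2 / 2.
for m, s in [(1.0, 2.0), (0.0, 0.3), (-2.0, 1.0), (0.5, 5.0)]:
    w = W(m, s)
    assert H(m, s) <= w * math.sqrt(I(m, s)) - 0.5 * w ** 2 + 1e-12
```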
Let us now enumerate some immediate consequences of Theorems 1, 2, 3. As a corollary of Theorem 1, we find
Corollary 1.1. Under the assumptions of Theorem 1, for all measurable sets B ⊂ M, and t ≥ \sqrt{(2/\rho)\log(1/\nu(B))}, one has
(15) \[ \nu(B_t) \ge 1 - e^{-\frac{\rho}{2}\left( t - \sqrt{\frac{2}{\rho}\log\frac{1}{\nu(B)}}\right)^2}. \]
This inequality was already obtained by Bobkov and Götze [5]. Next, as a corollary of Theorem 2 and Theorem 1, we obtain
Corollary 2.1. Let dν = e^{-\Psi} dx be a probability measure on M with finite moments of order 2, such that Ψ ∈ C²(M), D²Ψ + Ric ≥ ρ I_n, ρ > 0. Then T(ρ) holds.
And actually, using the Holley-Stroock perturbation lemma, we come up with the stronger statement
Corollary 2.2. Let dν = e^{-\Psi-\psi} dx be a probability measure on M, with finite moments of order 2, such that Ψ ∈ C²(M), D²Ψ + Ric ≥ ρ I_n, ρ > 0, ψ ∈ L^∞. Then T(ρ̃) holds, with ρ̃ = ρ e^{-osc(ψ)}.
We now state two corollaries of Theorem 3. The first one means that under some "convexity" assumption, T ⇒ LSI, maybe at the price of a degradation of the constants:
Corollary 3.1. Let dν = e^{-\Psi} dx be a measure on R^n, with Ψ ∈ C²(R^n), ∫ e^{-\Psi(x)} |x|² dx < +∞, and D²Ψ ≥ K I_n, K ∈ R. Assume that ν satisfies T(ρ) with ρ ≥ max(0, −K). Then ν also satisfies LSI(ρ̃) with
\[ \tilde\rho = \max\left[ \frac{\rho}{4}\left(1 + \frac{K}{\rho}\right)^2,\ K \right]. \]
In particular, if K > 0, ν satisfies LSI(K); and if Ψ is convex, ν satisfies LSI(ρ/4).
Remark. This result is sharp at least for K > 0.
Another variant is the implication LSI+T ⇒ LSI, or LSI+T ⇒ T. Quite surprisingly, these implications essentially always hold true, in fact as soon as D²Ψ is bounded below by any real number.
Corollary 3.2. Let dν = e^{-\Psi} dx be a measure on R^n, with Ψ ∈ C²(R^n), ∫ e^{-\Psi(x)} |x|² dx < +∞, and D²Ψ ≥ K I_n, K ∈ R. Assume that ν satisfies LSI+T(ρ) with ρ ≥ max(K, 0). Then ν also satisfies LSI(ρ̃), and thus also T(ρ̃), with
\[ \tilde\rho = \frac{\rho}{2 - \dfrac{K}{\rho}}. \]
Remark. Again, this is sharp for ρ = K > 0.
Proof of Corollaries 3.1 and 3.2. Let us use the shorthands W = W(µ,ν), H = H(µ|ν), I = I(µ|ν), and assume that all these quantities are positive (if not, there is nothing to prove).
By direct resolution, and since W cannot be negative, inequality (13) implies W ≥ (√I − √(I − 2KH))/K if K ≠ 0 (and I − 2KH has to be nonnegative if K > 0), and W ≥ H/√I if K = 0. In all the cases, using W ≤ √(2H/ρ), this leads, if ρ ≥ −K, to
\[ H \le \frac{2I}{\rho\left(1 + \frac{K}{\rho}\right)^2}, \]
which is the result of Corollary 3.1. (Of course, if K ≤ ρ, then ρ̃ is no improvement of ρ.)
The proof of Corollary 3.2 follows the same lines. □
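For the reader's convenience, here is the simplest instance of the above resolution, the convex case K = 0 of Corollary 3.1, spelled out; this merely restates the argument. From (14) together with T(ρ):

```latex
\[
H \;\le\; W\sqrt{I} \;\le\; \sqrt{\frac{2H}{\rho}}\,\sqrt{I}
\quad\Longrightarrow\quad
H^2 \le \frac{2H}{\rho}\, I
\quad\Longrightarrow\quad
H \le \frac{2}{\rho}\, I = \frac{1}{2\,(\rho/4)}\, I,
\]
```

which is precisely LSI(ρ/4), as announced.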
Without any convexity assumption on Ψ, it seems likely that the implication T(ρ) ⇒ LSI(ρ̃) fails, although we do not have a counterexample at present. On the other hand, as will be shown in the last section, the Talagrand inequality is still strong enough to imply a Poincaré inequality.
Let us comment further on our results. First, define the transportation distance with linear cost, or Monge-Kantorovich distance,
\[ T_1(\mu,\nu) = \inf_{\pi \in \Pi(\mu,\nu)} \int_{M\times M} d(x,y) \, d\pi(x,y). \]
By the Cauchy-Schwarz inequality, T_1 ≤ √T_2 = W. Bobkov and Götze [5] prove, actually in a more general setting than ours, that an inequality (referred to below as T_1(ρ)) of the form
\[ T_1(\mu,\nu) \le \sqrt{\frac{2H(\mu|\nu)}{\rho}}, \]
holding for all probability measures µ absolutely continuous w.r.t. ν and with finite second moments, is equivalent to a concentration inequality of the form
(16) \[ \int_M e^{\lambda F} \, d\nu \le e^{\lambda \int F \, d\nu + \lambda^2/(2\rho)}, \]
holding for all Lipschitz functions F on M with ‖F‖_Lip ≤ 1. Such a concentration inequality can also be seen as a consequence of the logarithmic Sobolev inequality LSI(ρ). So our results extend theirs by showing the stronger inequality for W. In short, LSI(ρ) ⇒ T(ρ) ⇒ T_1(ρ) ⇒ (16) ⇒ (15).
By the arguments of [5], another consequence of Theorem 1 is that if ν satisfies LSI(ρ), then for all measurable functions f on M,
\[ \int_M e^{\rho\, Sf} \, d\nu \le e^{\rho \int f \, d\nu}, \]
with Sf(x) ≡ inf_{y∈M} [f(y) + \frac{1}{2} d(x,y)^2]. Indeed, this is a consequence of the general identities (the first of which is a special case of the Kantorovich duality)
\[ \frac{1}{2}\, W(\mu,\nu)^2 = \sup_{f \in C_b(M)} \left\{ \int Sf \, d\mu - \int f \, d\nu \right\}, \]
\[ H(\mu|\nu) = \sup_{\varphi \in C_b(M)} \left\{ \int \varphi \, d\mu - \log \int e^{\varphi} \, d\nu \right\}. \]
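The second identity, the variational (Donsker-Varadhan) formula for H, is easy to check on a finite state space, where the supremum is attained at φ = log(dµ/dν); a small sketch (our addition, with an arbitrarily chosen two-point example):

```python
import math

# Two-point example: mu and nu are probability vectors on {0, 1}.
mu = [0.3, 0.7]
nu = [0.5, 0.5]

def dual_value(phi):
    # integral of phi d(mu)  minus  log of integral of e^phi d(nu).
    return (sum(p * m for p, m in zip(phi, mu))
            - math.log(sum(math.exp(p) * n for p, n in zip(phi, nu))))

H = sum(m * math.log(m / n) for m, n in zip(mu, nu))
phi_star = [math.log(m / n) for m, n in zip(mu, nu)]

assert abs(dual_value(phi_star) - H) < 1e-12       # the optimizer attains H
for phi in ([0.0, 0.0], [1.0, -1.0], [0.2, 0.9]):  # any other phi stays below
    assert dual_value(phi) <= H + 1e-12
```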
The proofs of [5] are essentially based on functional characterizations similar to the ones above, and thus completely different from our PDE tools. This explains our more restricted setting: we need a differentiable structure.
To conclude this presentation, we mention that, very recently, G. Blower communicated to us a direct (independent) proof of Corollary 2.1, in the Euclidean setting (this is part of a work in preparation, on the Gaussian isoperimetric inequality and its links with transportation). His argument involves neither logarithmic Sobolev inequalities nor partial differential equations. One drawback of this approach is that perturbation lemmas for Talagrand inequalities seem much more delicate to obtain than perturbation lemmas for logarithmic Sobolev inequalities (due to the nonlocal nature of the Wasserstein distance). In Section 5, we briefly reinterpret Blower's proof within our framework.
3. Heuristics
In this section, we shall explain how the inequalities T, LSI, HWI
and the Bakry–Emery condition have a simple and appealing interpre-
tation in terms of a formal Riemannian setting. This formalism, which
is somewhat reminiscent of Arnold’s [2] geometric viewpoint of fluid
mechanics, was developed by the first author in [26]. It gives precious
methodological help in various situations. We wish to make it clear
that we consider it only as a formal tool, and that the arguments in
this section will not be used anywhere else in the paper. Hence, the
reader interested only in the proofs (and not in the ideas) may skip
this section.