Mixture Models Simplification Comparison

Simplifying and comparing Kernel Density Estimators

Journée Transport Optimal et Géométrie de l'information
IHP, Paris, France

Olivier Schwander, Frank Nielsen
joint work with A. Sim, M. Levitt, J. Bernauer, A. Schutz, Y. Berthoumieu

École Polytechnique
Sony CSL

February 10, 2012

Outline

Mixture Models
  - Statistical mixtures
  - Getting mixtures

Simplification
  - Bregman Hard Clustering
  - Model Hard Clustering
  - Experiments

Comparison
  - Majorizing Kullback-Leibler
  - Other ground distances
  - Comparing and EMD

Mixture models

Mixture
  - $\Pr(X = x) = \sum_i \omega_i \Pr(X = x \mid \theta_i)$
  - each $\Pr(X = x \mid \theta_i)$ is a probability density function

Famous special case
  - Gaussian Mixture Models (GMM)

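As a quick illustration of the definition above, a mixture density can be evaluated by summing the weighted component densities; the weights and Gaussian parameters below are made up for the example.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def mixture_pdf(x, weights, params):
    """Pr(X = x) = sum_i w_i Pr(X = x | theta_i) for a Gaussian mixture."""
    return sum(w * gaussian_pdf(x, mu, sigma)
               for w, (mu, sigma) in zip(weights, params))

# Example: a two-component GMM (weights must sum to 1).
weights = [0.3, 0.7]
params = [(-1.0, 0.5), (2.0, 1.0)]
density = mixture_pdf(0.0, weights, params)
```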
Getting mixtures

Expectation-Maximization

Kernel density estimation
  - also known as the Parzen windows method
  - one kernel per data point (often a Gaussian kernel)
  - fixed bandwidth

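A minimal sketch of the Parzen windows idea just described: one Gaussian kernel per data point, fixed bandwidth. The sample data and bandwidth below are arbitrary.

```python
import math

def kde(x, data, h):
    """Kernel density estimate at x: one Gaussian kernel per data point,
    with fixed bandwidth h (Parzen windows)."""
    n = len(data)
    norm = 1.0 / (n * h * math.sqrt(2 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

data = [0.1, 0.4, 0.35, 1.2, 1.5]
estimate = kde(0.5, data, h=0.3)
```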
[Figure: kernel density estimate curve]

Why simplification?

A lot of components
  - 120 × 120 = 14400 Gaussians in the previous curve

KDE: good approximation but
  - very large mixture: time and memory problems
  - a low number of components is often enough (EM)

EM: small approximation but we may want a fixed number of components without learning a new mixture
  - EM is slow
  - we don't have the original dataset, just the model

k-means

[Figure: k-means illustration]

k-means

What do we need?
  - a distance (or a divergence, or a dissimilarity measure)
  - a centroid

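Those two ingredients are all a Lloyd-style hard clustering loop needs. A sketch, with squared Euclidean distance and the arithmetic mean standing in for the divergence and its centroid (for the Bregman generator $F(x) = x^2$, this pair is exactly k-means):

```python
def hard_clustering(points, centers, distance, centroid, iterations=20):
    """Lloyd-style hard clustering: assign each point to its closest
    center, then recompute each center as the centroid of its cluster."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            best = min(range(len(centers)), key=lambda j: distance(p, centers[j]))
            clusters[best].append(p)
        # Keep the old center when a cluster ends up empty.
        centers = [centroid(c) if c else centers[j] for j, c in enumerate(clusters)]
    return centers

# Squared Euclidean distance and arithmetic mean.
dist = lambda a, b: (a - b) ** 2
mean = lambda c: sum(c) / len(c)

points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
centers = hard_clustering(points, [0.0, 1.0], dist, mean)
```

Swapping `dist` and `mean` for a Bregman divergence and its matching (closed-form) centroid gives the Bregman hard clustering scheme the slides build on.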
Kullback-Leibler divergence

Divergence
  - $D(P \| Q) = \int p(x) \log \frac{p(x)}{q(x)}\, dx$
  - not a symmetric divergence

Centroids
  - left-sided one: $\min_x \sum_i \omega_i B_F(x, p_i)$
  - right-sided one: $\min_x \sum_i \omega_i B_F(p_i, x)$
  - various symmetrizations!
  - known in closed form

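These centroids are computable because, for exponential families such as Gaussians, the KL divergence itself has a closed form. A minimal sketch for univariate Gaussians using the standard formula (the parameter values are made up):

```python
import math

def kl_gaussian(mu1, sigma1, mu2, sigma2):
    """Closed-form KL(N(mu1, sigma1^2) || N(mu2, sigma2^2))."""
    return (math.log(sigma2 / sigma1)
            + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2 * sigma2 ** 2)
            - 0.5)

# Asymmetry: D(P || Q) != D(Q || P) in general.
d_pq = kl_gaussian(0.0, 1.0, 1.0, 2.0)
d_qp = kl_gaussian(1.0, 2.0, 0.0, 1.0)
```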
Fisher divergence

Riemannian metric on the statistical manifold

Fisher information matrix

$$g_{ij} = I_{ij}(\theta) = E\left[\frac{\partial}{\partial\theta_i} \log p(X;\theta)\, \frac{\partial}{\partial\theta_j} \log p(X;\theta)\right]$$

$$ds^2 = \sum_i \sum_j g_{ij}\, d\theta_i\, d\theta_j$$

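As a worked instance of the definition above (not spelled out on the slide), for a univariate Gaussian with parameters $\theta = (\mu, \sigma)$ the Fisher information matrix is the classical diagonal matrix:

```latex
I(\mu, \sigma) =
\begin{pmatrix}
\frac{1}{\sigma^2} & 0 \\
0 & \frac{2}{\sigma^2}
\end{pmatrix},
\qquad
ds^2 = \frac{d\mu^2 + 2\, d\sigma^2}{\sigma^2}
```

Up to the factor $2$, this is the metric of the hyperbolic upper half-plane, which is what makes the next slide's half-plane distance relevant.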
Fisher divergence formula

Known for 0-mean Gaussians
  - not really interesting for general case mixtures...
  - open problem for other cases

For 1D data
  - Poincaré hyperbolic distance in the Poincaré upper half-plane

$$\mathrm{FRD}(f_p, f_q) = \dots$$

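For background, the standard hyperbolic distance between two points $z_1, z_2$ in the Poincaré upper half-plane (with $\operatorname{Im} z$ the imaginary part) is:

```latex
d_H(z_1, z_2) = \operatorname{arccosh}\!\left(
  1 + \frac{|z_1 - z_2|^2}{2 \operatorname{Im} z_1 \operatorname{Im} z_2}
\right)
```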
