em-tutorial-bilmes98gentle
15 pages
English




INTERNATIONAL COMPUTER SCIENCE INSTITUTE
1947 Center St., Suite 600
Berkeley, California 94704-1198
(510) 643-9153, FAX (510) 643-7684
A Gentle Tutorial of the EM Algorithm
and its Application to Parameter
Estimation for Gaussian Mixture and
Hidden Markov Models
Jeff A. Bilmes (bilmes@cs.berkeley.edu)
International Computer Science Institute
Berkeley CA, 94704
and
Computer Science Division
Department of Electrical Engineering and Computer Science
U.C. Berkeley
TR-97-021
April 1998
Abstract
We describe the maximum-likelihood parameter estimation problem and how the Expectation-Maximization (EM) algorithm can be used for its solution. We first describe the abstract form of the EM algorithm as it is often given in the literature. We then develop the EM parameter estimation procedure for two applications: 1) finding the parameters of a mixture of Gaussian densities, and 2) finding the parameters of a hidden Markov model (HMM) (i.e., the Baum-Welch algorithm) for both discrete and Gaussian mixture observation models. We derive the update equations in fairly explicit detail but we do not prove any convergence properties. We try to emphasize intuition rather than mathematical rigor.
1 Maximum likelihood

Recall the definition of the maximum-likelihood estimation problem. We have a density function p(x|Θ) that is governed by the set of parameters Θ (e.g., p might be a set of Gaussians and Θ could be the means and covariances). We also have a data set of size N, supposedly drawn from this distribution, i.e., X = {x_1, ..., x_N}. That is, we assume that these data vectors are independent and identically distributed (i.i.d.) with distribution p. Therefore, the resulting density for the samples is

    p(X|Θ) = ∏_{i=1}^{N} p(x_i|Θ) = L(Θ|X).

This function L(Θ|X) is called the likelihood of the parameters given the data, or just the likelihood function. The likelihood is thought of as a function of the parameters Θ where the data X is fixed. In the maximum likelihood problem, our goal is to find the Θ that maximizes L. That is, we wish to find Θ* where

    Θ* = argmax_Θ L(Θ|X).

Often we maximize log(L(Θ|X)) instead because it is analytically easier.

Depending on the form of p(x|Θ) this problem can be easy or hard. For example, if p(x|Θ) is simply a single Gaussian distribution where Θ = (μ, σ²), then we can set the derivative of log(L(Θ|X)) to zero, and solve directly for μ and σ² (this, in fact, results in the standard formulas for the mean and variance of a data set). For many problems, however, it is not possible to find such analytical expressions, and we must resort to more elaborate techniques.
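For the easy single-Gaussian case, the closed-form solution can be sketched directly. This is a minimal illustration, not code from the report; the function name and the simulated data are assumptions for the example.

```python
import math
import random

def gaussian_mle(data):
    """Closed-form ML estimates for a single 1-D Gaussian.

    Setting the derivative of the log-likelihood to zero yields the
    sample mean and the 1/N (not 1/(N-1)) sample variance.
    """
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n  # note 1/N: the ML estimate
    return mu, var

# Illustrative data: 10,000 draws from N(mean=5, std=2).
random.seed(0)
sample = [random.gauss(5.0, 2.0) for _ in range(10000)]
mu_hat, var_hat = gaussian_mle(sample)
```

With enough data, mu_hat and var_hat land close to the true mean 5 and variance 4, matching the standard formulas the text refers to.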
2 Basic EM
The EM algorithm is one such elaborate technique. The EM algorithm [ALR77, RW84, GJ95, JJ94,
Bis95, Wu83] is a general method of finding the maximum likelihood estimate of the parameters of
an underlying distribution from a given data set when the data is incomplete or has missing values.
There are two main applications of the EM algorithm. The first occurs when the data indeed
has missing values, due to problems with or limitations of the observation process. The second
occurs when optimizing the likelihood function is analytically intractable but when the likelihood
function can be simplified by assuming the existence of and values for additional but missing (or
hidden) parameters. The latter application is more common in the computational pattern recognition
community.
As before, we assume that data X is observed and is generated by some distribution. We call X the incomplete data. We assume that a complete data set Z = (X, Y) exists and also assume (or specify) a joint density function

    p(z|Θ) = p(x, y|Θ) = p(y|x, Θ) p(x|Θ).
Where does this joint density come from? Often it “arises” from the marginal density function
and the assumption of hidden variables and parameter value guesses (e.g., our two examples, mixture densities and Baum-Welch). In other cases (e.g., missing data values in samples of a
distribution), we must assume a joint relationship between the missing and observed values.
With this new density function, we can define a new likelihood function,

    L(Θ|Z) = L(Θ|X, Y) = p(X, Y|Θ),

called the complete-data likelihood. Note that this function is in fact a random variable since the missing information Y is unknown, random, and presumably governed by an underlying distribution. That is, we can think of L(Θ|X, Y) = h_{X,Θ}(Y) for some function h_{X,Θ}(·) where X and Θ are constant and Y is a random variable. The original likelihood L(Θ|X) is referred to as the incomplete-data likelihood function.

The EM algorithm first finds the expected value of the complete-data log-likelihood log p(X, Y|Θ) with respect to the unknown data Y given the observed data X and the current parameter estimates. That is, we define:

    Q(Θ, Θ^(i-1)) = E[ log p(X, Y|Θ) | X, Θ^(i-1) ]          (1)

Where Θ^(i-1) are the current parameters estimates that we used to evaluate the expectation and Θ are the new parameters that we optimize to increase Q.

This expression probably requires some explanation. The key thing to understand is that X and Θ^(i-1) are constants, Θ is a normal variable that we wish to adjust, and Y is a random variable governed by the distribution f(y|X, Θ^(i-1)). The right side of Equation 1 can therefore be re-written as:

    E[ log p(X, Y|Θ) | X, Θ^(i-1) ] = ∫_{y ∈ Υ} log p(X, y|Θ) f(y|X, Θ^(i-1)) dy          (2)

Note that f(y|X, Θ^(i-1)) is the marginal distribution of the unobserved data and is dependent on both the observed data X and on the current parameters, and Υ is the space of values y can take on. In the best of cases, this marginal distribution is a simple analytical expression of the assumed parameters Θ^(i-1) and perhaps the data. In the worst of cases, this density might be very hard to obtain.
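For the Gaussian-mixture application developed later in the tutorial, this marginal of the unobserved data is just the posterior probability ("responsibility") that each component generated each point. A sketch under that assumption, for a 1-D two-component mixture (the function name is illustrative):

```python
import math

def responsibilities(x, weights, means, variances):
    """Posterior f(y = j | x, Theta): the probability that observation x
    was generated by mixture component j, under the current parameters.
    This is the marginal distribution of the hidden data used in Eq. (2).
    """
    def gauss(x, m, v):
        return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

    # Bayes' rule: joint w_j * p(x|j), normalized over components.
    joint = [w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [j / total for j in joint]

# A point at x = 0 is far more plausible under the component centered at -1
# than under the one centered at 4.
r = responsibilities(0.0, [0.5, 0.5], [-1.0, 4.0], [1.0, 1.0])
```

Here the "easy case" holds: the marginal is a simple analytical expression in the current parameters and the data.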
Sometimes, in fact, the density actually used is f(y, X|Θ^(i-1)) = f(y|X, Θ^(i-1)) f(X|Θ^(i-1)), but this doesn't affect subsequent steps since the extra factor, f(X|Θ^(i-1)), is not dependent on Θ.
As an analogy, suppose we have a function h(·, ·) of two variables. Consider h(θ, Y) where θ is a constant and Y is a random variable governed by some distribution f_Y(y). Then q(θ) = E_Y[h(θ, Y)] = ∫_y h(θ, y) f_Y(y) dy is now a deterministic function of θ that could be maximized if desired.
The evaluation of this expectation is called the E step of the algorithm. Notice the meaning of the two arguments in the function Q(Θ, Θ^(i-1)). The first argument Θ corresponds to the parameters that ultimately will be optimized in an attempt to maximize the likelihood. The second argument Θ^(i-1) corresponds to the parameters that we use to evaluate the expectation.
The second step (the M step) of the EM algorithm is to maximize the expectation we computed in the first step. That is, we find:

    Θ^(i) = argmax_Θ Q(Θ, Θ^(i-1)).

These two steps are repeated as necessary. Each iteration is guaranteed to increase the log-likelihood and the algorithm is guaranteed to converge to a local maximum of the likelihood function. There are many rate-of-convergence papers (e.g., [ALR77, RW84, Wu83, JX96, XJ96]) but we will not discuss them here.
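The alternating E and M steps can be sketched end-to-end for the simplest interesting case, a two-component 1-D Gaussian mixture. This anticipates the mixture-density derivation later in the tutorial; the initialization strategy and the simulated data are assumptions made for the example, not prescriptions from the report.

```python
import math
import random

def em_gmm_1d(data, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture: alternate the E step
    (responsibilities under the current parameters) and the M step
    (closed-form re-estimation of weights, means, and variances)."""
    # Crude illustrative initialization: put the means at the data extremes.
    mu = [min(data), max(data)]
    var = [1.0, 1.0]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E step: r[i][j] = f(y_i = j | x_i, current parameters).
        r = []
        for x in data:
            joint = [w[j] * math.exp(-(x - mu[j]) ** 2 / (2 * var[j]))
                     / math.sqrt(2 * math.pi * var[j]) for j in range(2)]
            s = sum(joint) + 1e-300  # guard against numerical underflow
            r.append([jnt / s for jnt in joint])
        # M step: maximize Q by re-estimating each parameter in closed form.
        for j in range(2):
            nj = sum(ri[j] for ri in r)
            w[j] = nj / len(data)
            mu[j] = sum(ri[j] * x for ri, x in zip(r, data)) / nj
            var[j] = sum(ri[j] * (x - mu[j]) ** 2 for ri, x in zip(r, data)) / nj
            var[j] = max(var[j], 1e-6)  # keep variances from collapsing
    return w, mu, var

# Illustrative data: two well-separated clusters centered at 0 and 6.
random.seed(1)
data = ([random.gauss(0.0, 1.0) for _ in range(500)]
        + [random.gauss(6.0, 1.0) for _ in range(500)])
w, mu, var = em_gmm_1d(data)
```

As the text states, each iteration increases the log-likelihood, so for well-separated clusters like these the estimated means settle near the true centers; the local-maximum caveat means a poor initialization could still converge elsewhere.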
Recall that E_Y[h(Y)|X = x] = ∫_y h(y) f_{Y|X}(y|x) dy. In the following discussion, we drop the subscripts from different density functions since argument usage should disambiguate different ones.
