Efficient Optimization for Discriminative Latent Class Models
Description

Level: Higher education, doctorate (Bac+8)

Subjects

  • discriminative clustering

  • latent representation

  • convex optimization

  • EM algorithm

  • reduced-rank regression


Extract

Efficient Optimization for Discriminative Latent Class Models
Armand Joulin, INRIA, 23, avenue d'Italie, 75214 Paris, France. armand.joulin@inria.fr
Francis Bach, INRIA, 23, avenue d'Italie, 75214 Paris, France. francis.bach@inria.fr
Jean Ponce, École Normale Supérieure, 45, rue d'Ulm, 75005 Paris, France. jean.ponce@ens.fr
Abstract
Dimensionality reduction is commonly used in the setting of multi-label supervised classification to control the learning capacity and to provide a meaningful representation of the data. We introduce a simple forward probabilistic model which is a multinomial extension of reduced rank regression, and show that this model provides a probabilistic interpretation of discriminative clustering methods with added benefits in terms of number of hyperparameters and optimization. While the expectation-maximization (EM) algorithm is commonly used to learn these probabilistic models, it usually leads to local maxima because it relies on a non-convex cost function. To avoid this problem, we introduce a local approximation of this cost function, which in turn leads to a quadratic non-convex optimization problem over a product of simplices. In order to maximize quadratic functions, we propose an efficient algorithm based on convex relaxations and low-rank representations of the data, capable of handling large-scale problems. Experiments on text document classification show that the new model outperforms other supervised dimensionality reduction methods, while simulations on unsupervised clustering show that our probabilistic formulation has better properties than existing discriminative clustering methods.
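To make the optimization problem in the abstract concrete, here is a minimal sketch of maximizing a quadratic function over a product of simplices. It is only an illustration of the problem structure under assumed names (B, Y, maximize_quadratic_over_simplices): it uses plain projected gradient ascent, not the convex-relaxation and low-rank algorithm the paper proposes.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex,
    via the standard sorting-based algorithm (e.g., Duchi et al., 2008)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u - css / np.arange(1, len(v) + 1) > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

def maximize_quadratic_over_simplices(B, Y0, step=0.1, iters=200):
    """Projected gradient ascent for max_Y tr(Y^T B Y), with each row of
    Y (n x k) constrained to the simplex. B is assumed symmetric. The
    problem is non-convex, so this only finds a local maximum."""
    Y = Y0.copy()
    for _ in range(iters):
        Y = Y + step * 2.0 * (B @ Y)                     # gradient ascent step
        Y = np.apply_along_axis(project_simplex, 1, Y)   # row-wise projection
    return Y

# Toy run: random symmetric PSD objective, Dirichlet-initialized rows.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))
B = A @ A.T / 50
Y = maximize_quadratic_over_simplices(B, rng.dirichlet(np.ones(4), size=50))
```

Because the objective is non-convex, different initializations of Y can reach different local maxima; the convex relaxation and low-rank representations mentioned in the abstract are aimed precisely at reducing this dependence on initialization.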
Introduction
Latent representations of data are widespread tools in supervised and unsupervised learning. They are used to reduce the dimensionality of the data for two main reasons: on the one hand, they provide numerically efficient representations of the data; on the other hand, they may lead to better predictive performance. In supervised learning, latent models are often used in a generative way, e.g., through mixture models on the input variables only, which may not lead to increased predictive performance. This has led to numerous works on supervised dimension reduction (e.g., [1, 2]), where the final discriminative goal of prediction is taken explicitly into account during the learning process. In this context, various probabilistic models have been proposed, such as mixtures of experts [3] or discriminative restricted Boltzmann machines [4], where a layer of hidden variables is used between the inputs and the outputs of the supervised learning model. Parameters are usually estimated by expectation-maximization (EM), a method that is computationally efficient but whose cost function may have many local maxima in high dimensions. In this paper, we consider a simple discriminative latent class (DLC) model where inputs and outputs are independent given the latent representation. We make the following contributions:
WILLOW project-team, Laboratoire d'Informatique de l'École Normale Supérieure (ENS/INRIA/CNRS UMR 8548).
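The conditional independence described in the introduction, where inputs and outputs interact only through the latent class, amounts to p(y|x) = Σ_z p(z|x) p(y|z). The sketch below makes this predictive distribution concrete; the softmax link and the parameter matrices W and V are hypothetical choices for illustration, not the paper's exact parameterization (a multinomial extension of reduced-rank regression).

```python
import numpy as np

def softmax(a, axis=-1):
    """Row-wise softmax with max-subtraction for numerical stability."""
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def dlc_predict(X, W, V):
    """Predictive distribution p(y | x) = sum_z p(z | x) p(y | z).

    X: (n, d) inputs; W: (d, k) hypothetical parameters of the
    multinomial p(z | x); V: (k, m) rows holding the output
    distributions p(y | z). Inputs and outputs are conditionally
    independent given the latent class z."""
    p_z_given_x = softmax(X @ W, axis=1)   # (n, k)
    return p_z_given_x @ V                 # (n, m), rows sum to 1

# In EM, the E-step for such a model forms responsibilities
# p(z | x, y) proportional to p(z | x) * p(y | z) for the observed y;
# the non-convexity of the resulting likelihood is what leads EM to
# the local maxima discussed above.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 10))
W = rng.standard_normal((10, 3))
V = rng.dirichlet(np.ones(4), size=3)     # 3 latent classes, 4 labels
print(dlc_predict(X, W, V).sum(axis=1))   # each row sums to 1
```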