Sparse methods for machine learning

Francis Bach
Willow project, INRIA - École Normale Supérieure
CVPR Tutorial - June 2010
Special thanks to R. Jenatton, G. Obozinski

Sparse methods for machine learning: Outline
  • Sparse linear estimation with the $\ell_1$-norm
      • Lasso
      • Important theoretical results
  • Structured sparse methods on vectors
      • Groups of features / Multiple kernel learning
  • Sparse methods on matrices
      • Multi-task learning
      • Matrix factorization (low-rank, sparse PCA, dictionary learning)
Supervised learning and regularization
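Data: $x_i \in \mathcal{X}$, $y_i \in \mathcal{Y}$, $i = 1, \dots, n$
Minimize with respect to function $f : \mathcal{X} \to \mathcal{Y}$:

$$\sum_{i=1}^{n} \ell(y_i, f(x_i)) \;+\; \frac{\lambda}{2} \|f\|^2$$

  • First term: error on data (which loss and function space?)
  • Second term: regularization (which norm?)

Two theoretical/algorithmic issues:
1. Loss
2. Function space / norm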
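
To make the regularized objective on this slide concrete, here is a minimal sketch that evaluates it for one particular setting. Everything specific is an assumption made for illustration and is not fixed by the slide: a linear predictor f(x) = w · x, the square loss 0.5 (y - f(x))^2, the squared Euclidean norm of w standing in for ‖f‖², and the hypothetical function name `regularized_risk`.

```python
def regularized_risk(w, X, y, lam):
    """Return sum_i loss(y_i, f(x_i)) + (lam / 2) * ||w||^2 for f(x) = w . x.

    Assumptions (not from the slide): square loss 0.5 * (y - f(x))^2 and
    the squared Euclidean norm of w as the regularizer.
    """
    data_term = 0.0
    for x_i, y_i in zip(X, y):
        prediction = sum(w_j * x_ij for w_j, x_ij in zip(w, x_i))
        data_term += 0.5 * (y_i - prediction) ** 2
    penalty = 0.5 * lam * sum(w_j ** 2 for w_j in w)
    return data_term + penalty


# Toy data: two points, two features.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 2.0]

# w = [1, 2] fits the data exactly, so only the penalty contributes.
perfect_fit = regularized_risk([1.0, 2.0], X, y, lam=2.0)
# w = [0, 0] pays no penalty, so only the error on data contributes.
zero_model = regularized_risk([0.0, 0.0], X, y, lam=2.0)
```

With this toy data and lam = 2, the exact fit costs 5.0 while the zero model costs 2.5, which illustrates how the regularization term trades off against the error on data.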
Regularizations
Main goal: avoid overfitting
Two main lines of work:
1. Euclidean and Hilbertian norms (i.e., $\ell_2$-norms)
  • Possibility of nonlinear predictors
  • Nonparametric supervised learning and kernel methods
  • Well-developed theory and algorithms (see, e.g., Wahba, 1990; Schölkopf and Smola, 2001; Shawe-Taylor and Cristianini, 2004)
2. Sparsity-inducing norms
  • Usually restricted to linear predictors on vectors $f(x) = w^\top x$
  • Main example: $\ell_1$-norm $\|w\|_1 = \sum_{i=1}^{p} |w_i|$
  • Perform model selection as well as regularization
  • Theory and algorithms "in the making"
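
One way to see why the ℓ1-norm performs model selection while a squared ℓ2-norm only shrinks is to compare the two penalties on a one-dimensional problem. The sketch below is an illustration added here, not something taken from the slides: it assumes the separable problem of minimizing 0.5 (w_i - z_i)^2 plus a penalty for each coordinate, and the function names `l1_norm`, `soft_threshold`, and `ridge_shrink` are hypothetical.

```python
def l1_norm(w):
    """Return ||w||_1 = sum_i |w_i|."""
    return sum(abs(w_i) for w_i in w)


def soft_threshold(z, t):
    """Minimize 0.5 * (w_i - z_i)^2 + t * |w_i| for each coordinate.

    Entries with |z_i| <= t are set exactly to zero; the others move
    toward zero by t. This is where the sparsity comes from.
    """
    out = []
    for z_i in z:
        if z_i > t:
            out.append(z_i - t)
        elif z_i < -t:
            out.append(z_i + t)
        else:
            out.append(0.0)
    return out


def ridge_shrink(z, t):
    """Minimize 0.5 * (w_i - z_i)^2 + (t / 2) * w_i^2 for each coordinate.

    Every entry is scaled by 1 / (1 + t), so nonzero entries stay nonzero.
    """
    return [z_i / (1.0 + t) for z_i in z]


z = [3.0, -0.5, 0.2, -2.0]
sparse = soft_threshold(z, 1.0)  # l1 penalty: small entries become zero
dense = ridge_shrink(z, 1.0)     # squared l2 penalty: all entries shrink
```

For this input the ℓ1 solution is [2.0, 0.0, 0.0, -1.0], with two coordinates removed entirely, whereas the squared ℓ2 solution [1.5, -0.25, 0.1, -1.0] keeps every coordinate.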