Machine Learning Reductions Tutorial
Alina Beygelzimer (IBM Research)
John Langford (Yahoo! Research)
Bianca Zadrozny (Fluminense Federal University, Brazil)
July 13, 2009
Applying Machine Learning in Practice
The problem you want to solve is not the problem solved by standard learning algorithms / toolboxes.
Applications gave rise to many different types of machine learning problems:
- Cost-sensitive classification
- Hierarchical classification
- Structured classification
- Machine translation
- Ranking
- ...
Applying Machine Learning in Practice
Approaches:
1. Design new algorithms (or modify existing ones).
2. Think about how to reuse old ones.
This tutorial is about the second approach.
Core Problem
Binary Classification
Given training data {(x1, y1), ..., (xn, yn)}, produce a classifier h : X → {0, 1}.
Unknown underlying distribution D over X × {0, 1}.
Find h with small 0-1 loss:
ℓ0/1(h, D) = Pr_{(x,y)∼D}[h(x) ≠ y]
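As a quick illustration (not part of the tutorial), the 0-1 loss of a classifier on a finite sample is just the fraction of disagreements with the labels. The classifier h below is a hypothetical stand-in; any function X → {0, 1} works.

```python
import numpy as np

# Hypothetical stand-in for a learned classifier h : X -> {0, 1}.
def h(x):
    return 1 if x.sum() > 0 else 0

# Empirical 0-1 loss on a sample drawn from D: the fraction of
# examples where the prediction disagrees with the label.
def zero_one_loss(h, xs, ys):
    return np.mean([h(x) != y for x, y in zip(xs, ys)])

xs = np.array([[1.0], [-2.0], [3.0]])
ys = np.array([1, 0, 0])
print(zero_one_loss(h, xs, ys))  # 1 of 3 predictions is wrong
```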
There have been years of research and development of algorithms for solving this problem. It would be nice to be able to reuse this for other problems.
What do we want from a reduction?
It should be robust: good 0/1 performance should imply good performance on the problem we care about.
It should be modular: able to plug in any classifier and reuse created reductions similarly.
Simple Variation on the Core Problem
Importance-Weighted Classification
Given training data {(x1, y1, c1), ..., (xn, yn, cn)}, produce a classifier h : X → {0, 1}.
Unknown underlying distribution D over X × {0, 1} × [0, ∞).
Find h with small expected cost:
ℓ(h, D) = E_{(x,y,c)∼D}[c · 1(h(x) ≠ y)]
Common scenario: one class is rare but the cost of not recognizing examples of this class is generally high. Can’t apply binary classification algorithms directly.
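A sketch of the objective above (not from the tutorial): on a finite sample, the empirical importance-weighted loss charges each mistake on (x, y, c) its cost c rather than 1. The classifier h and the example values are hypothetical.

```python
import numpy as np

# Hypothetical classifier; any h : X -> {0, 1} works here.
def h(x):
    return 1 if x[0] > 0 else 0

# Empirical importance-weighted loss: each mistake on (x, y, c)
# costs c instead of 1, matching l(h, D) = E[c * 1(h(x) != y)].
def expected_cost(h, examples):
    return np.mean([c * (h(x) != y) for x, y, c in examples])

examples = [
    (np.array([1.0]), 1, 1.0),    # correct, contributes 0
    (np.array([-1.0]), 1, 10.0),  # missed rare-class example, costs 10
    (np.array([2.0]), 0, 1.0),    # false positive, costs 1
]
print(expected_cost(h, examples))
```

Note how the single rare-class mistake dominates the average, which is exactly why plain 0-1 loss is the wrong target in this scenario.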
Other Approaches
(1) Make particular classifier learners cost-sensitive
- Need to know the algorithm + implementation
- For a number of algorithms, it's not so easy to perform the conversion correctly
(2) Bayes risk minimization
R(y = i | x) = D(y = 1 - i | x) · C(x) for i ∈ {0, 1},
where C(x) is the expected cost of misclassifying x. Requires estimating class membership probabilities D(y | x) and example-dependent costs.
Many classifiers yield probability estimates but these are often very poor.
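For concreteness, here is a minimal sketch of the Bayes risk rule above, assuming we already have an estimate p1 of D(y = 1 | x) and a cost estimate C(x). Both inputs are exactly the quantities that are hard to estimate well, which is the weakness this slide points out.

```python
# Bayes risk rule from the slide: R(y = i | x) = D(y = 1 - i | x) * C(x).
# p1 is an estimate of D(y = 1 | x); cost is C(x), the expected cost of
# misclassifying x. Predict the label with the smaller risk.
def bayes_predict(p1, cost):
    risk = {
        0: p1 * cost,           # predicting 0 is wrong with probability p1
        1: (1.0 - p1) * cost,   # predicting 1 is wrong with probability 1 - p1
    }
    return min(risk, key=risk.get)

print(bayes_predict(0.7, 5.0))  # risk of predicting 1 is lower here
```

Since both risks share the factor C(x), a poor probability estimate p1 directly corrupts the decision, regardless of how accurate the cost estimate is.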
Reductions
Goal: minimize ℓ on D
Algorithm for optimizing ℓ0/1