Machine Learning ReductionsTutorialAlina Beygelzimer (IBM Research)John Langford (Yahoo! Research)Bianca Zadrozny (Fluminense Federal University, Brazil)July 13, 2009Applying Machine Learning in PracticeThe problem you want to solve is not the problem solved bystandard learning algorithms / toolboxes.Applications gave rise to many di erent types of machinelearning problems:Cost-sensitive classi cationHierarchical classi cationStructuredMachine translationRanking...Applying Machine Learning in PracticeApproaches:1 Design new algorithms (or modify existing ones).2 Think about how to reuse old ones.This tutorial is about the second approach.6Core ProblemBinary Classi cationGiven training dataf(x ;y );:::; (x ;y )g, produce a1 1 n nclassi er h : X!f0; 1g.Unknown underlying distribution D over Xf0; 1g.Find h with small 0-1 loss:‘ (h;D) = Pr [h(x) = y]0=1 (x;y)DThere have been years of research and development of algorithmsfor solving this problem. It would be nice to be able to reuse thisfor other problems.It should be robust. Good 0/1 performance should imply goodperformance on the problem we care about.It should be modular. Able to plug in any classi er and reusecreated reductions similarly.What do we want from a reduction?It should be modular. Able to plug in any classi er and reusecreated reductions similarly.What do we want from a reduction?It should be robust. Good 0/1 performance should imply goodperformance on the problem we ...
Common scenario: one class is rare but the cost of not recognizing examples of this class is generally high. Can’t apply binary classification algorithms directly.
Other Approaches
(1) Make particular classifier learners cost-sensitive
Need to know the algorithm + implementation
For a number of algorithms, it’s not so easy to perform the conversion correctly
(2) Bayes risk minimization
R(y=i|x) =D(y= 1−i|x)C(x)
fori∈ {0,1},
whereC(x) is the expected cost of misclassifyingx. Requires estimating class membership probabilitiesD(y|x) and example-dependent costs.
Many classifiers yield probability estimates but these are often very poor.