Machine Learning Reductions Tutorial
Alina Beygelzimer (IBM Research)
John Langford (Yahoo! Research)
Bianca Zadrozny (Fluminense Federal University, Brazil)
July 13, 2009
Applying Machine Learning in Practice
The problem you want to solve is not the problem solved by standard learning algorithms / toolboxes.
Applications gave rise to many different types of machine learning problems:
- Cost-sensitive classification
- Hierarchical classification
- Structured classification
- Machine translation
- Ranking
- ...
Applying Machine Learning in Practice
Approaches:
1. Design new algorithms (or modify existing ones).
2. Think about how to reuse old ones.
This tutorial is about the second approach.
Core Problem
Binary Classification
Given training data {(x1, y1), ..., (xn, yn)}, produce a classifier h : X → {0, 1}.
Unknown underlying distribution D over X × {0, 1}.
Find h with small 0-1 loss:
ℓ0/1(h, D) = Pr_{(x,y)∼D}[h(x) ≠ y]
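As a quick illustration (not part of the tutorial), the 0-1 loss of a classifier on a finite sample is just the fraction of disagreements with the labels. The classifier h below is a hypothetical stand-in; any function X → {0, 1} works.

```python
import numpy as np

# Hypothetical stand-in for a learned classifier h : X -> {0, 1}.
def h(x):
    return 1 if x.sum() > 0 else 0

# Empirical 0-1 loss on a sample drawn from D: the fraction of
# examples where the prediction disagrees with the label.
def zero_one_loss(h, xs, ys):
    return np.mean([h(x) != y for x, y in zip(xs, ys)])

xs = np.array([[1.0], [-2.0], [3.0]])
ys = np.array([1, 0, 0])
print(zero_one_loss(h, xs, ys))  # 1 of 3 predictions is wrong
```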
There have been years of research and development of algorithms for solving this problem. It would be nice to be able to reuse this for other problems.
What do we want from a reduction?
It should be robust: good 0/1 performance should imply good performance on the problem we care about.
It should be modular: able to plug in any classifier and reuse created reductions similarly.
Simple Variation on the Core Problem
Importance-Weighted Classification
Given training data {(x1, y1, c1), ..., (xn, yn, cn)}, produce a classifier h : X → {0, 1}.
Unknown underlying distribution D over X × {0, 1} × [0, ∞).
Find h with small expected cost:
ℓ(h, D) = E_{(x,y,c)∼D}[c · 1(h(x) ≠ y)]
Common scenario: one class is rare but the cost of not recognizing examples of this class is generally high. Can’t apply binary classification algorithms directly.
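A sketch of the objective above (not from the tutorial): on a finite sample, the empirical importance-weighted loss charges each mistake on (x, y, c) its cost c rather than 1. The classifier h and the example values are hypothetical.

```python
import numpy as np

# Hypothetical classifier; any h : X -> {0, 1} works here.
def h(x):
    return 1 if x[0] > 0 else 0

# Empirical importance-weighted loss: each mistake on (x, y, c)
# costs c instead of 1, matching l(h, D) = E[c * 1(h(x) != y)].
def expected_cost(h, examples):
    return np.mean([c * (h(x) != y) for x, y, c in examples])

examples = [
    (np.array([1.0]), 1, 1.0),    # correct, contributes 0
    (np.array([-1.0]), 1, 10.0),  # missed rare-class example, costs 10
    (np.array([2.0]), 0, 1.0),    # false positive, costs 1
]
print(expected_cost(h, examples))
```

Note how the single rare-class mistake dominates the average, which is exactly why plain 0-1 loss is the wrong target in this scenario.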
Other Approaches
(1) Make particular classifier learners cost-sensitive
- Need to know the algorithm + implementation
- For a number of algorithms, it's not so easy to perform the conversion correctly
(2) Bayes risk minimization
R(y = i | x) = D(y = 1 - i | x) · C(x) for i ∈ {0, 1},
where C(x) is the expected cost of misclassifying x. Requires estimating class membership probabilities D(y | x) and example-dependent costs.
Many classifiers yield probability estimates but these are often very poor.
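For concreteness, here is a minimal sketch of the Bayes risk rule above, assuming we already have an estimate p1 of D(y = 1 | x) and a cost estimate C(x). Both inputs are exactly the quantities that are hard to estimate well, which is the weakness this slide points out.

```python
# Bayes risk rule from the slide: R(y = i | x) = D(y = 1 - i | x) * C(x).
# p1 is an estimate of D(y = 1 | x); cost is C(x), the expected cost of
# misclassifying x. Predict the label with the smaller risk.
def bayes_predict(p1, cost):
    risk = {
        0: p1 * cost,           # predicting 0 is wrong with probability p1
        1: (1.0 - p1) * cost,   # predicting 1 is wrong with probability 1 - p1
    }
    return min(risk, key=risk.get)

print(bayes_predict(0.7, 5.0))  # risk of predicting 1 is lower here
```

Since both risks share the factor C(x), a poor probability estimate p1 directly corrupts the decision, regardless of how accurate the cost estimate is.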
Reductions
Goal: minimize ℓ on D
Algorithm for optimizing ℓ0/1