Parametric Link Models for Knowledge Transfer in Statistical Learning


Chapter 1

PARAMETRIC LINK MODELS FOR KNOWLEDGE TRANSFER IN STATISTICAL LEARNING

Beninel F.¹, Biernacki C.², Bouveyron C.³, Jacques J.*² and Lourme A.⁴

¹ CREST-ENSAI, Bruz, France
² Université Lille 1 & CNRS & INRIA, Lille, France
³ Université Paris 1 Panthéon-Sorbonne, Paris, France
⁴ Université de Pau et des Pays de l'Adour, Pau, France

Abstract

When a statistical model is designed for prediction, a major assumption is the absence of evolution in the modeled phenomenon between the training and the prediction stages. Thus, training and future data must be in the same feature space and must have the same distribution. Unfortunately, this assumption often turns out to be false in real-world applications. For instance, biological motivations could lead one to classify individuals from a given species when only individuals from another species are available for training. In regression, one would sometimes like to use a predictive model on data that do not have exactly the same distribution as the training data used to estimate the model. This chapter presents techniques for transferring a statistical model estimated from a source population to a target population. Three tasks of statistical learning are considered: probabilistic classification (parametric and semi-parametric), linear regression (including mixtures of regressions) and model-based clustering (Gaussian and Student). In each situation, the knowledge transfer is carried out by introducing parametric links between both populations. Using such transfer techniques should improve learning performance while avoiding expensive data labeling efforts.

Key Words: Adaptive estimation, link between populations, transfer learning, classification, regression, clustering, EM algorithm, applications.

AMS Subject Classification: 62H30, 62J99.

* E-mail address: julien.jacques@polytech-lille.fr

1. Introduction

Statistical learning [18] is a key tool in many scientific and applied fields since it makes it possible to explain and predict diverse phenomena from the observation of related data. It leads to a wide variety of methods, depending on the particular problem at hand. Examples of such problems are numerous:
• Examples E1: In Credit Scoring, predict the behavior of borrowers in paying back a loan, on the basis of information known about these customers; in Medicine, predict the risk of lung cancer recurrence for a patient treated for a first cancer, on the basis of the type of treatment used for the first cancer and of clinical and demographic measurements for that patient.
• Examples E2: In Economics, predict housing prices on the basis of several descriptive housing variables; in Finance, predict the profitability of a financial asset six months after purchase.
• Examples E3: In Marketing, create customer groups according to their purchase history in order to target a marketing campaign; in Biology, identify groups in a sample of birds described by some biometric features, groups which finally reveal the presence of different genders.
In a typical statistical learning problem, a response variable $y \in \mathcal{Y}$ has to be predicted from a set of $d$ feature variables (or covariates) $x = (x_1, \ldots, x_d) \in \mathcal{X}$. The spaces $\mathcal{X}$ and $\mathcal{Y}$ are usually quantitative or categorical. The feature variables may also be heterogeneous (both quantitative and categorical, for instance). The analysis always relies on a training data set $S = (\mathbf{x}, \mathbf{y})$, in which the response and feature variables are observed for a set of $n$ individuals, respectively denoted by $\mathbf{x} = (x_1, \ldots, x_n)$ and $\mathbf{y} = (y_1, \ldots, y_n)$. Using $S$, a predictive model is built in order to predict the response variable for a new individual, for which the covariates $x$ are observed but not the response $y$. This typical situation is called supervised learning. In particular, if $\mathcal{Y}$ is a categorical space, it corresponds to a discriminant analysis situation, which aims to solve problems like Examples E1. If $\mathcal{Y}$ is a quantitative space, it corresponds to a regression situation, which aims to solve problems like Examples E2. Note also that if $\mathbf{y}$ is only partially known in $S$, the situation is called semi-supervised learning.
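The two supervised situations can be illustrated with a minimal sketch. It is not taken from the chapter: the function names, the one-dimensional toy data and the choice of a least-squares line and a nearest-centroid rule are all assumptions made for illustration only.

```python
from statistics import mean

def fit_regression(x, y):
    """Least-squares fit of y = a + b * x (quantitative response)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b

def fit_centroids(x, y):
    """One mean per class label (categorical response)."""
    return {k: mean(xi for xi, yi in zip(x, y) if yi == k) for k in set(y)}

def predict_class(centroids, x_new):
    """Assign x_new to the class whose centroid is closest."""
    return min(centroids, key=lambda k: abs(x_new - centroids[k]))

# Toy training sets S = (x, y); all values are made up for illustration.
a, b = fit_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(a + b * 5)  # predicted response for a new individual with x = 5

centroids = fit_centroids([0.8, 1.0, 1.2, 2.8, 3.0, 3.2], ["a"] * 3 + ["b"] * 3)
print(predict_class(centroids, 1.4))  # predicted label for a new individual
```

In both cases the model is built from $S$ only and then applied to an individual whose response is unknown, which is exactly where the assumption of a common population matters.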
Another typical statistical learning problem consists in predicting the whole set of responses $\mathbf{y}$ without ever having observed them. In this case only the feature variables are known, thus $S = \mathbf{x}$, and it corresponds to an unsupervised learning situation. If $\mathcal{Y}$ is restricted to a categorical space (the most frequent case), the purpose is clustering; related problems are illustrated by Examples E3.
In this chapter, we focus on statistical modeling for solving both supervised and unsupervised learning problems. Many classical probabilistic methods exist and we will give useful references, when necessary, throughout the chapter. Readers interested in such references are invited to look at the related sections below.
A main assumption in supervised learning is the absence of evolution in the modeled phenomenon between the training of the model and the prediction of the response for a new individual. More precisely, the new individual is assumed to arise from the same statistical population as the training individuals. In unsupervised learning, it is also implicitly assumed that all individuals arise from the same population. Unfortunately, such classical hypotheses may not hold in many realistic situations, as reflected by revisiting Examples E1 to E3:
• Examples E1*: In Credit Scoring, the statistical scoring model has been trained on a data set of customers but is used to predict the behavior of non-customers; in Medicine, the risk of lung cancer recurrence is learned for a European patient but will be applied to an Asian patient.
• Examples E2*: In Economics, a real-estate agency long established on the US East Coast aims to conquer new markets by opening several agencies on the West Coast, but the two markets are quite different; in Finance, expertise on the financial assets of the past year is surely different from that of the current one.
• Examples E3*: In Marketing, the customers to be classified correspond in fact to a pooled panel of new and older customers; in Biology, different subspecies of birds are pooled together and may consequently have highly different features for the same gender.
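The effect of such a population change on a trained rule can be sketched on toy data. The example below is an assumption-laden illustration, not a result from the chapter: the two classes, the nearest-centroid rule and the target values (roughly twice the source values, as for a larger species) are all made up.

```python
from statistics import mean

def fit_centroids(x, y):
    """One mean per class label, estimated on the training sample."""
    return {k: mean(xi for xi, yi in zip(x, y) if yi == k) for k in set(y)}

def error_rate(centroids, x, y):
    """Share of individuals whose nearest-centroid label differs from y."""
    wrong = 0
    for xi, yi in zip(x, y):
        pred = min(centroids, key=lambda k: abs(xi - centroids[k]))
        wrong += pred != yi
    return wrong / len(x)

labels = ["a"] * 3 + ["b"] * 3
x_source = [0.8, 1.0, 1.2, 2.8, 3.0, 3.2]  # hypothetical source population
x_target = [2.2, 2.4, 2.6, 5.8, 6.0, 6.2]  # hypothetical target, about twice as large

rule = fit_centroids(x_source, labels)       # trained on the source only
source_error = error_rate(rule, x_source, labels)
target_error = error_rate(rule, x_target, labels)
print(source_error, target_error)  # 0.0 on the source, 0.5 on the target
```

The rule is perfect on the population it was trained on and misclassifies every target individual of class "a", even though the class structure of the target is just as clear as that of the source.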
In the supervised setting, the question is Q1: Is it necessary to collect new training data and to build a new statistical learning model, or can the previous training data still be useful? In the unsupervised setting, the question is Q2: Is it better to perform a single clustering on the whole data set or to perform several independent clusterings on some identified subsets?
Question Q1 is addressed by transfer learning, and a general overview is given in [31]. Transfer learning techniques aim to transfer the knowledge learned on a source population $\Omega$ to a target population $\Omega^*$, in which this knowledge will be used for prediction. These techniques are divided into two important situations: the transfer of a model does need or does not need the observation of some response variables in the target domain. The first case is referred to as inductive transfer learning whereas the second one is referred to as transductive transfer learning. Usually, the classification purpose described in Examples E1* can be solved by either transductive or inductive transfer learning, the choice depending on the model at hand (generative or predictive models). In contrast, the regression purpose described in Examples E2* can only be solved by inductive transfer learning since only predictive models are involved. Question Q2 is addressed by unsupervised transfer learning. It corresponds to the simultaneous clustering of several samples and, thus, it concerns Examples E3*.

A common expected advantage of all these transfer learning techniques is a real predictive benefit, since the knowledge learned on the source population is used in addition to the available information on the target population. However, the common challenge is to establish a transfer function between the source and the target populations. In this chapter, we focus on parametric statistical models. Besides being good competitors to nonparametric models in terms of prediction, these models have the advantage of being easily interpreted by practitioners. Since parametric models will be used, it is natural to model the transfer function by some parametric links. Thus, in addition to a predictive benefit, the interpretability of the link parameters will give practitioners useful information on the evolution and the differences between the source and target populations.
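The idea of a parametric link in the inductive setting can be sketched as follows. The specific link used here, a single scalar `lam` that rescales the source regression function, is an assumption chosen for illustration and is not stated in the text above; the data and the names `fit_scalar_link`, `f_source` and `f_target` are hypothetical as well.

```python
from statistics import mean

def fit_regression(x, y):
    """Least-squares fit of y = a + b * x on the source sample."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b

def fit_scalar_link(f, x_target, y_target):
    """Least-squares estimate of lam in the assumed link y* = lam * f(x)."""
    fx = [f(xi) for xi in x_target]
    return sum(v * yi for v, yi in zip(fx, y_target)) / sum(v * v for v in fx)

# Labeled source sample (made-up values): the full model is estimated here.
a, b = fit_regression([1, 2, 3, 4], [3, 5, 7, 9])

def f_source(x):
    return a + b * x

# Only two labeled target individuals: enough for the single link parameter.
lam = fit_scalar_link(f_source, [1, 3], [4.5, 10.5])

def f_target(x):
    return lam * f_source(x)

print(lam, f_target(2))  # link parameter and a prediction in the target
```

Only one parameter is estimated on the target sample instead of a whole new model, and its value (here 1.5) can be read directly as how much the target responses are scaled relative to the source, which is the kind of interpretability mentioned above.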

This chapter is organized as follows. Section 2 presents transfer learning for different discriminant analysis contexts: Gaussian model (continuous covariates), Bernoulli model (binary covariates) and logistic model (continuous or binary covariates). Section 3 considers the transfer of regression models for a quantitative response variable in two situations: usual regression and mixture of regressions. Finally, Section 4 proposes models to cluster simultaneously a source and a target population, again in two situations: mixtures of Gaussian and Stu