Learning under differing training and test distributions [electronic resource] / by Steffen Bickel

Universität Potsdam - Bickel, Hans Steffen

110 pages

English

Description

Institut für Informatik
Arbeitsgruppe Maschinelles Lernen

Learning under Differing Training and Test Distributions

Dissertation submitted for the academic degree "doctor rerum naturalium" (Dr. rer. nat.) in the scientific discipline of Computer Science, to the Faculty of Mathematics and Natural Sciences of the University of Potsdam, by Steffen Bickel

Potsdam, 22 July 2009

Published online at the Institutional Repository of the University of Potsdam: URL http://opus.kobv.de/ubp/volltexte/2009/3333/ URN urn:nbn:de:kobv:517-opus-33331 [http://nbn-resolving.org/urn:nbn:de:kobv:517-opus-33331]

Abstract

One of the main problems in machine learning is to train a predictive model from training data and to make predictions on test data. Most predictive models are constructed under the assumption that the training data is governed by the exact same distribution which the model will later be exposed to. In practice, control over the data collection process is often imperfect. A typical scenario is when labels are collected by questionnaires and one does not have access to the test population. For example, parts of the test population are underrepresented in the survey, out of reach, or do not return the questionnaire. In many applications training data from the test distribution are scarce because they are difficult to obtain or very expensive.

Subjects