A statistical approach to rule learning [Elektronische Ressource] / Ulrich Rückert

218 pages

Découvre YouScribe en t'inscrivant gratuitement

Je m'inscris

A statistical approach to rule learning [Elektronische Ressource] / Ulrich Rückert

technische_universitat_munchen

Découvre YouScribe en t'inscrivant gratuitement

Je m'inscris

Obtenez un accès à la bibliothèque pour le consulter en ligne
En savoir plus

218 pages

Obtenez un accès à la bibliothèque pour le consulter en ligne
En savoir plus

A propos
Informations
Extrait

Description

A Statistical Approach to Rule LearningUlrich Ruc kertInstitut fur Informatik, Lehrstuhl fur BioinformatikA Statistical Approach to RuleLearningUlrich RuckertVollst andiger Abdruck der von der Fakult at fur Informatik der TechnischenUniversit at Munc hen zur Erlangung des akademischen Grades einesDoktors der Naturwissenschaftengenehmigten Dissertation.Vorsitzender: Univ.-Prof. Dr. H. J. SchmidhuberPrufer der Dissertation:1. Univ.-Prof. Dr. St. Kramer2. Univ.-Prof. Dr. J. Furnkranz,Technische Universit at DarmstadtDie Dissertation wurde am 30.8.2007 bei der Technischen Universit at Munc heneingereicht und durch die Fakult at fur Informatik am 20.2.2008 angenommen.Contents1 Introduction 11.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1.1 A Typical Learning Scenario . . . . . . . . . . . . . . 11.1.2 The Trade-O Between Comprehensibility and Pre-dictive Accuracy . . . . . . . . . . . . . . . . . . . . . 21.1.3 Statistical Analysis of Rule Learning . . . . . . . . . . 41.2 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51.3 Outline of the Thesis . . . . . . . . . . . . . . . . . . . . . . . 62 Rule Learning 92.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92.2 Rule Set Representations . . . . . . . . . . . . . . . . . . . . 102.2.1 DNF Rule Sets . . . . . . . . . . . . . . . . . . . . . . 102.2.2 Decision Lists . . . . . . . . . . . . . . . . . . . . . . . 112.2.

Sujets

Informatik

Informations

Publié par	technische_universitat_munchen
Publié le	01 janvier 2008
Nombre de lectures	44
Poids de l'ouvrage	1 Mo

Extrait

Statistical

hApproac

Ulrich

ertkuc¨R

elRu

Learning

Institutf¨urInformatik,Lehrstuhlf¨urBioinformatik

AStatisticalApproachtoRule
Learning

¨tuckerRUlrich

Vollst¨andigerAbdruckdervonderFakult¨atf¨urInformatikderTechnischen
Universit¨atM¨unchenzurErlangungdesakademischenGradeseines

haftenNaturwissenscderDoktors

Dissertation.genehmigten

Vorsitzender:Univ.-Prof.Dr.H.J.Schmidhuber

Pr¨uferderDissertation:

KramerSt.Dr.Univ.-Prof.1.

2.TecUniv.-Prof.hnischeUniDr.vJ.ersit¨F¨aturnkranz,Darmstadt

DieeingereichtDissertationunddurcwurdehdieamFakult¨30.8.2007atf¨urbeiderInformatikTechniscamhen20.2.2008Universit¨atangenommen.M¨unchen

tstenCon

ductiontroIn11.1Motivation............................
1.1.1ATypicalLearningScenario..............
1.1.2TheTrade-OﬀBetweenComprehensibilityandPre-
dictiveAccuracy.....................
1.1.3StatisticalAnalysisofRuleLearning..........
1.2Applications............................
1.3OutlineoftheThesis.......................

LearningRule22.1Introduction............................
2.2RuleSetRepresentations....................
2.2.1DNFRuleSets......................
2.2.2DecisionLists.......................
2.2.3WeightedRuleSets....................
2.3RuleConditionRepresentations.................
2.4ChallengesinRuleLearning...................
2.4.1CombinatorialComplexity................
2.4.2OverﬁttingAvoidance..................
2.5AlgorithmicApproachestoRuleLearning...........
2.5.1Separate-and-Conquer..................
2.5.2OtherApproaches....................

LearninghineMacStatistical33.1Introduction............................
3.2AStatisticalFrameworkforClassiﬁcation...........
3.3BasicResults...........................
3.3.1TheTestSetBound...................
3.3.2BayesClassiﬁerandConsistency............
3.3.3NoFreeLunch......................
3.3.4ApproximationandEstimationError.........
3.4CapacityControl.........................
3.4.1EmpiricalRiskMinimization..............

1112456

991010111213151518202022

27272931323335384141

3.4.2StructuralRiskMinimization..............43
3.4.3Regularization......................45
3.5EnsembleTheory.........................47
3.5.1Bagging..........................47
3.5.2Boosting..........................49
3.6BayesianLearning........................52
57SetsRuleDNF44.1Trade-OﬀsinRuleLearning...................57
4.2EmpiricalRiskMinimizationforDNFRuleSets.......59
4.2.1ARandomizedAlgorithmfork-termDNFConsistency61
4.2.2StochasticLocalSearchfork-termDNFMaximum
Agreement........................68
4.3StructuralRiskMinimizationforDNFRuleSets.......73
24.3.1TheSLRuleLearner..................74
4.3.2Experiments.......................76
4.4EnsembleBasedCapacityControl...............83
4.4.1TheLearningSetting..................84
4.4.2LearningEnsemblesofRuleSets............85
4.4.3BoundsforRuleLearning................86
4.4.4Experiments.......................91
4.5SummaryandRelatedWork...................95
5WeightedRuleSets99
5.1Motivation............................99
5.2EmpiricalRiskMinimizationforWeightedRuleSets.....101
5.2.1Soft-MarginLoss.....................102
5.2.2EmpiricalMargin.....................103
5.2.3MarginMinusVariance.................105
5.3StructuralRiskMinimizationforWeightedRuleSets.....108
5.3.1RademacherBoundsforMMV.............109
5.3.2PAC-BayesianBoundsforMMV............115
5.3.3RepositorySizeBounds.................119
5.4TheRumbleSystem.......................125
5.5Experiments............................128
5.6SummaryandRelatedWork...................132
135LearningRuleFirst-Order66.1Motivation............................135
6.2ExtendingRumbletoFirst-OrderData............138
6.3AFrameworkforRelationalLearning.............141
6.3.1High-LevelView.....................142
6.3.2Low-LevelView......................146
6.3.3InstantiationsoftheFramework............148

6.4Dispersion-BasedRuleGenerationforStructuredData....150
6.4.1FindingExpressiveRuleSets..............151
6.4.2StochasticLocalSearchforOptimalDispersionFeatures155
6.5Experiments............................162
6.5.1ExperimentswithRumbleonFirst-OrderData...162
6.5.2AnEmpiricalInvestigationofRuleLearninginthe
Framework........................164
6.5.3ExperimentsWithDispersion-BasedRuleGeneration169
6.6SummaryandRelatedWork...................171

7SummaryandOutlook175
7.1Summary.............................175
7.2Outlook..............................177

yBibliograph

Index

iii

179

198

golden

rule

that

there

are

golden

rules.

georGe

dBernar

Shaw

Abstract

Thisdissertationinvestigatesrulelearningasastatisticalclassiﬁcationprob-
lem.Rulelearninghasalonghistoryinmachinelearningandisknownfor
inducinginterpretableandcomprehensibleclassiﬁers.Methodsfromsta-
tisticalmachinelearning,ontheotherhand,havetraditionallyfocusedon
predictiveaccuracy,oftenattheexpenseofinterpretability.Thegoalof
thisworkistocombinetheapproachestoobtainhighlypredictive,yetin-
terpretableclassiﬁers.Unlikemanyoftheheuristicsappliedintraditional
tionruleoflearning,rulelearningstatisticalsystems,princpiplesossiblyallowleadinforgatobvetteraluabletheoreticalinsights.invToestiga-this
end,weinvestigatetwotypesofrulesets,propositionallogicformulaein
formdisjunctivulae,eweinnormalvestigateformthe(DNFNP-hardformulae),problemandwofeightedempiricalrulerissets.kForminimiza-DNF
tionofDNFformulaeofatmostkterms.Weproposeastochasticlocal
search(SLS)algorithmtosolvethisproblemeﬃciently.Then,weapply
theSLSalgorithminastructuralriskminimizationproceduretogainsmall
yetpredictiverulesets.Alternatively,wecombineSLSwithanensem-
bleapproachtoreducetheestimationpartofthepredictionerrorandto
upper-bounditanalytically.Forweightedrulesets,wedeviseanovelconvex
optimizationcriterion,MarginMinusVariance(MMV),afeasiblerelaxation
oftheinfeasibleempiricalriskminimizationproblem.Forcapacitycontrol
wederiveanovelconcentrationinequality.Animplementationofthese
ideas,theRumblesystem,yieldsfavorableresultsinexperimentsdealing
withstructure-activityrelationshipsofsmallmolecules.Finally,formulti-
learningrelationalsystemslearning,weaccordingproptoosetheandﬂowuseofaframewinformationork.toWeassessevaluateanddiﬀerencomparet
strategiesanalyticallyandexperimentallyandpresentanovelrulegenera-
tioncriterionthataimsathighlydiverserulesets.

ZusammenfassungDieseDissertationuntersuchtRegellernenalseinstatistischesKlassiﬁkati-
onsproblem.RegellernenistseitlangemTeildesmaschinellenLernensund
bekanntdaf¨ur,einfachinterpretierbareundverst¨andlicheKlassiﬁziererzu
erzeugen.MethodendesstatistischenmaschinellenLernens,aufderande-
renSeite,konzentrierensich¨ublicherweiseaufdieVorhersagegenauigkeit,
des¨ofterenauchaufKostenderInterpretierbarkeit.DasZieldieserAr-
beitistes,diebeidenAns¨atzemiteinanderzukombinieren,umKlassiﬁ-
ziermithoherVorhersagegenauigkeitundInterpretierbarkeitzuerstellen.
AußerdemerlaubenstatistischePrinzipien,andersalsdieHeuristiken,die
oftinherk¨ommlichenRegellernernverwendetwerden,einebesseretheore-
tischeAnalyse,diem¨oglicherweisezuwertvollenEinsichtenf¨uhrenk¨onnte.
WiruntersuchenzudiesemZweckezweiArtenvonRegelmengen,n¨amlich
aussagenlogischeFormelnindisjunktiverNormalform(DNF-Formeln)und
gewichteteRegelmengen.ImFallederDNF-Formelnuntersuchenwirdas
NP-harteProblem,denempirischenFehlervonDNF-Formelnmitmaxi-
malkKonjunktionenzuminimieren.WirschlageneinenAlgorithmusba-
sierendaufstochastischerlokalerSuche(SLS)vor,umdasProblemeﬃzient
zul¨osen.DanachwendenwirSLSinnerhalbeinerMethodezurstruktu-
rellenFehler-Minimierungan,umkleineRegelmengenmithoherVorhersa-
gegenauigkeitzugenerieren.AlseinenalternativenAnsatzverbindenwir
SLSmitEnsembles,umdenAbsch¨atzungsanteildesVorhersagefehlerszu
minimierenundaufanalytischeWeiseabzusch¨atzen.ImFalledergewich-
tetenRegelmengenentwerfenwirdasneuekonvexeOptimierungskriterium
“AbstandminusVarianz”(engl.MarginMinusVariance,MMV),einerea-
lisierbareVereinfachungdespraktischnichtrealisierbarenProblems,den
empirischenFehlerzuminimieren.F¨urdieKapazit¨ats¨uberwachungleiten
wireineneueUngleichungzurDichteabsch¨atzungher.EineImplementation
dervorgestelltenIdeen,dasRumble-System,erzieltpositiveErgebnissein
ExperimentenmitStruktur-Aktivit¨ats-BeziehungenvonkleinenMolek¨ulen.
F¨urmulti-relationalesLernen,schließlich,entwerfenundverwendenwirein
theoretischesSystem,indemverschiedeneLernsystemeanhanddesInfor-
mationsﬂussesbewertetundverglichenwerdenk¨onnen.Wirbewertenver-
schiedeneStrategienaufanalytischeundexperimentelleArtundWeiseund
stelleneinneuesRegel-Generierungskriteriumvor,dasaufhoheDiversit¨at
abzielt.Regelmengenin

tswledgemenknoAc

Manypeoplethinkthatwritingadissertationisadaunting,time-consuming,
anddiﬃculttask,whichneedstobewell-plannedanddemandsahighdegree
ofdisciplineanddiligence.Thisisnotconsistentwithmyexperience.I
foundthatwritingadissertationisadaunting,time-consuming,anddiﬃcult
advtask,enturewhichandnevergenerallyprocee