GEOGRAPHY 4xx GEOPOLITICS

GEOGRAPHY 4xx GEOPOLITICS

English
251 pages
Lire
Le téléchargement nécessite un accès à la bibliothèque YouScribe
Tout savoir sur nos offres

Description

  • cours magistral - matière potentielle : theme
  • cours - matière potentielle : time
  • cours magistral
  • cours magistral - matière potentielle : notes
  • cours - matière potentielle : thought
  • cours magistral - matière potentielle : per week
  • cours magistral - matière : geography
  • cours - matière potentielle : syllabus
Mathew Coleman Department of Geography, Ohio State GEOGRAPHY 465 : GLOBAL POLITICS AND THE MODERN GEOPOLITICAL IMAGINATION Instructor: Professor M. Coleman Office: 1156 Derby Hall Office Hours:___________ and/or by appointment Email: (Please put “G465” in subject line) Office Tel: (614) 292-9686 ***PROPOSED NEW COURSE** Course Rationale International Relations (IR) is commonly understood as the disciplinary home for research on war and conflict.
  • scientific basis of geopolitical theory- building
  • geographical closure of global space
  • geopolitical research
  • geopolitics
  • ohio-state
  • ohio state
  • political geography
  • lecture
  • geography
  • world war
  • power

Sujets

Informations

Publié par
Nombre de lectures 32
Langue English
Poids de l'ouvrage 1 Mo
Signaler un problème

INCREMENTAL LEARNING OF DISCRETE HIDDEN MARKOV MODELS
By
German Florez-Larrahondo
A Dissertation
Submitted to the Faculty of
Mississippi State University
in Partial Fulfillment of the Requirements
for the Degree of Doctor of Philosophy
in Computer Science
in the Department of Computer Science and Engineering
Mississippi State, Mississippi
August 2005INCREMENTAL LEARNING OF DISCRETE HIDDEN MARKOV MODELS
By
German Florez-Larrahondo
Approved:
Susan M. Bridges Julia E. Hodges
Professor of Computer Science Professor of Computer Science
and Engineering and Engineering
(Major Professor) (Committee Member)
Rayford B. Vaughn Eric Hansen
Professor of Computer Science Associate Professor of Computer Science
and Engineering and Engineering
(Committee Member) (Committee Member)
David Dampier Kirk H. Schulz
Assistant Professor of Computer Science Dean of the Bagley College
and Engineering of Engineering
(Committee Member)Name: German Florez-Larrahondo
Date of Degree: August 6, 2005
Institution: Mississippi State University
Major Field: Computer Science
Major Professor: Dr. Susan M. Bridges
Title of Study: INCREMENTAL LEARNING OF DISCRETE HIDDEN MARKOV
MODELS
Pages in Study: 227
Candidate for Degree of Doctor of Philosophy
We address the problem of learning discrete hidden Markov models from very long se-
quences of observations. Incremental versions of the Baum-Welch algorithm that approx-
imate theβ-values used in the backward procedure are commonly used for this problem
since their memory complexity is independent of the sequence length. However, tradi-
tional approaches have two main disadvantages: the approximation of theβ-values devi-
ates far from the real values, and the learning algorithm requires previous knowledge of
the topology of the model.
This dissertation describes a new incremental Baum-Welch algorithm with a novel
backward procedure that improves the approximation of theβ-values based on a one-step
lookahead in the training sequence and investigates heuristics to prune unnecessary states
from an initial complex model. Two new approaches for pruning, greedy and controlled,are introduced and a novel method for identification of ill-conditioned models is presented.
Incremental learning of multiple independent observations is also investigated.
We justify the new approaches analytically and report empirical results that show
they converge faster than the traditional Baum-Welch algorithm using fewer computer
resources. Furthermore, we demonstrate that the new learning algorithms converge faster
than the previous incremental approaches and can be used to perform online learning of
high-quality models useful for classification tasks.
Finally, this dissertation explores the use of the new algorithms for anomaly detection
in computer systems, that improve our previous research work on detectors based on hid-
den Markov models integrated into real-world monitoring systems of high-performance
computers.DEDICATION
I dedicate this dissertation to the memory of my father, and to my family.
iiACKNOWLEDGMENTS
I express my gratitude to Dr. Susan M. Bridges for her endless patience, invaluable
feedback, and most importantly, for helping me grow professionally. Dr. Bridges not only
motivated me to direct my research in the areas of machine learning and computer security,
but also encouraged me to develop my own ideas.
I am also grateful to the other members of my committee for their useful reviews and
suggestions. Special thanks to Dr. Rayford Vaughn for such an enthusiasm concerning my
research work. I am thankful for the financial support and the research opportunities that
he and Dr. Bridges have made available for me in the last years. I offer respectful thanks
to Dr. Eric Hansen who has provided invaluable critiques and who has helped me polish
my research work. I thank Dr. Julia Hodges for giving me the opportunity of being in this
wonderful country while pursuing a Ph.D. degree. Also, I am grateful to her and to Dr.
David Dampier for giving me the opportunity to teach a class for undergraduate students
—an invaluable experience.
I wish to acknowledge the members of the Center for Computer Security Research at
Mississippi State University. In particular, I thank Ambareen Siraj, Zhen Liu, and Wei Li,
who have always shown interest in my research work. I am also extremely thankful to
Yong Wang, who helped me to understand the many issues involved in text classification
problems and provided me with some of the tools used in the preprocessing of text data.
iiiI am also indebted to many other members of the faculty and staff at the Department
of Computer Science and Engineering. Special thanks to Brenda Collins and Jo Coleson
for the assistance provided in the last years.
Finally, I would like to acknowledge the following agencies that have partially funded
my research:
• Army Research Laboratory, grant DAAD17-01-C-0011;
• Office of Naval Research, grant N00014-01-1-0678;
• National Security Agency - IASP, grant H98230-04-1-0205;
• National Science Foundation - Cyber Trust, grant SCI-0430354.
ivTABLE OF CONTENTS
Page
DEDICATION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
ACKNOWLEDGMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
LIST OF SYMBOLS, ABBREVIATIONS, AND NOMENCLATURE . . . . . . . xvii
CHAPTER
I. INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Learning Models from Sequential Data . . . . . . . . . . . . . . . . . . 2
1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 Hypothesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.6 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
II. LITERATURE REVIEW . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.1 Machine Learning for Discrete Sequential Data . . . . . . . . . . . . . 16
2.2 Discrete Hidden Markov Models (HMM) . . . . . . . . . . . . . . . . 19
2.2.1 Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.2.2 The Baum-Welch Algorithm . . . . . . . . . . . . . . . . . . . 22
2.2.3 The SegmentalK-Means Algorithm . . . . . . . . . . . . . . . 28
2.2.4 Scaling and Other Implementation Issues . . . . . . . . . . . . . 30
2.2.5 Training an HMM with Multiple Observations . . . . . . . . . . 31
2.2.6 Similarity Between Models . . . . . . . . . . . . . . . . . . . . 32
2.2.7 Incremental Estimation of HMMs . . . . . . . . . . . . . . . . 33
2.2.8 Topology Learning . . . . . . . . . . . . . . . . . . . . . . . . 35
2.2.9 Extending the Capabilities of an HMM . . . . . . . . . . . . . . 38
vCHAPTER Page
2.3 A Brief Discussion of Continuous Observations . . . . . . . . . . . . . 42
2.4 Component Analysis for HMMs . . . . . . . . . . . . . . . . . . . . . 43
2.5 The Classification Problem via HMMs . . . . . . . . . . . . . . . . . . 44
2.5.1 Viterbi Decoding . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.5.2 Binary Classification . . . . . . . . . . . . . . . . . . . . . . . 45
2.5.3 Model-Class Framework UsingP(O|λ) . . . . . . . . . . . . . 45
2.5.4 Model-Class Frameworks Using Other Score Functions . . . . . 46
2.6 Applications of HMMs . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.7 Conditional Relative Entropy of a Source . . . . . . . . . . . . . . . . . 50
III. INCREMENTAL PARAMETER LEARNING OF A DISCRETE HMM . . . 51
3.1 The Incremental Baum-Welch Algorithm: IBW . . . . . . . . . . . . . 51
3.1.1 Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.1.2 Complexity Analysis . . . . . . . . . . . . . . . . . . . . . . . 56
3.1.3 Implementation Issues . . . . . . . . . . . . . . . . . . . . . . . 59
3.1.4 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.1.5 Smoothing the Learning Process . . . . . . . . . . . . . . . . . 66
3.1.6 Comparison of BW and IBW . . . . . . . . . . . . . . . . . . . 71
3.2 Improving theβ-values: The IBW+ Algorithm . . . . . . . . . . . . . . 77
3.2.1 A New Backward Procedure . . . . . . . . . . . . . . . . . . . 77
3.2.2 Impact of the New Backward Procedure on the Learning . . . . 82
3.3 Online Learning of Multiple Sequences: The mIBW+ Algorithm . . . . 85
3.4 ApproximatingP(O|λ) Incrementally . . . . . . . . . . . . . . . . . . 90
3.5 Additional Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . 92
3.5.1 Comparison of the Convergence Rates of BW, IBW and IBW+ . 94
3.5.2 Comparison of the Convergence Rates of mBW and mIBW+ . . 100
IV. INCREMENTAL TOPOLOGY LEARNING OF A DISCRETE HMM . . . . 103
4.1 Removing Low State-transition Probabilities . . . . . . . . . . . . . . . 107
4.2 Removing the Least Visited States . . . . . . . . . . . . . . . . . . . . 109
4.2.1 Greedy Pruning . . . . . . . . . . . . . . . . . . . . . . . . . . 112
4.2.2 Controlled Pruning . . . . . . . . . . . . . . . . . . . . . . . . 117
4.3 Discarding Ill-Conditioned HMM Parameters . . . . . . . . . . . . . . 123
4.3.1 Singular Value Decomposition and Condition Number . . . . . . 124
4.3.2 A New Test Based on the Inverse Condition Number ofC =A|B 126
4.3.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
4.4 Path Maintenance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
viCHAPTER Page
4.5 Implementation Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
4.6 A summary of tpIBW+ . . . . . . . . . . . . . . . . . . . . . . . . . . 132
4.7 Comparison of tpIBW+ and IBW+ . . . . . . . . . . . . . . . . . . . . 134
4.8 Additional Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . 138
V. APPLICATIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
5.1 Single-Class Classifiers . . . . . . . . . . . . . . . . . . . . . . . . . . 143
5.1.1 Area Under the ROC Curve . . . . . . . . . . . . . . . . . . . . 145
5.1.2 Score Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.1.2.1 The Negative of the Log Likelihood . . . . . . . . . . 148
5.1.2.2 Counter of Anomalies . . . . . . . . . . . . . . . . . . 148
5.1.3 An Example of a Score Function and an ROC curve . . . . . . . 149
5.1.4 Experiment Design . . . . . . . . . . . . . . . . . . . . . . . . 152
5.1.5 Experiment I: Impact ofN and Data Noise in Classification . . . 153
5.1.5.1 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.1.5.2 Experimental Results . . . . . . . . . . . . . . . . . . 156
5.1.5.3 The Synthesis vs. Recognition Tradeoff . . . . . . . . 164
5.1.6 Experiment II: Comparison of the Score Functions . . . . . . . . 166
5.1.6.1 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . 166
5.1.6.2 Experimental Results . . . . . . . . . . . . . . . . . . 174
5.2 Anomaly Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
5.2.1 Convergence and Computer Resources . . . . . . . . . . . . . . 179
5.2.2 Online Learning of Baseline-Models . . . . . . . . . . . . . . . 182
5.2.3 The Number of Hidden States and its Relation with Accuracy . . 183
5.2.4 Online Detection of Anomalies . . . . . . . . . . . . . . . . . . 185
5.2.4.1 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . 185
5.2.4.2 The Online Log Likelihood . . . . . . . . . . . . . . . 186
5.2.4.3 Experimental Results . . . . . . . . . . . . . . . . . . 187
VI. CONCLUSIONS AND FUTURE WORK . . . . . . . . . . . . . . . . . . . 192
6.1 Results and Contributions . . . . . . . . . . . . . . . . . . . . . . . . . 192
6.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
APPENDIX
A. THEORETICAL CONVERGENCE PROPERTIES OF BW AND IBW . . . 207
vii