A computational recognition system grounded in perceptual research [electronic resource] / submitted by Christian Wallraven

172 pages

English

Subject: Computer science (Informatik)
Published by: Eberhard Karls Universität Tübingen
Published on: 1 January 2007

A computational recognition system
grounded in perceptual research
Dissertation
for the attainment of the degree of Doctor
of Natural Sciences
at the Faculty of Mathematics and Physics
of the Eberhard Karls Universität Tübingen
submitted by
Christian Wallraven
from Kempen
2007
Date of oral examination: 18 October 2006
Dean: Prof. Dr. N. Schopohl
1st reviewer: Prof. Dr. H. Ruder / Prof. Dr. H. Bülthoff
2nd reviewer: Prof. Dr. B. Schölkopf
3rd reviewer: Prof. Dr. W. Straßer
Contents
1 Cognitive basis of object recognition
1.1 Introduction
1.2 Cognitive psychophysics
1.2.1 Structural versus view-based approaches
1.2.2 View-based recognition of faces
1.2.3 The canonical view
1.2.4 Temporal aspects of object learning
1.2.5 Temporal aspects of object recognition
1.2.6 Configuration and components
1.3 Physiology
1.3.1 Visual processing in the brain
1.3.2 Beyond the traditional view
1.4 Conclusion
2 Computational approaches to object recognition
2.1 Data representation
2.1.1 Structured shape models
2.1.2 Statistical appearance models
2.2 Classification algorithms
2.2.1 K-means with n nearest neighbor
2.2.2 Radial basis function networks
2.2.3 Support vector machines
2.3 Conclusion
3 A generic framework for object learning and recognition
3.1 Introduction
3.2 Learning and recognizing objects using keyframes
3.2.1 Related concepts
3.2.2 What defines a keyframe?
3.3 Discussion of the framework
3.3.1 Keyframes
3.3.2 Local visual features
3.4 Computational implementation
3.4.1 Visual features
3.4.2 Matching of visual features
3.4.3 Recognition and Incremental Learning
3.5 Conclusion
4 Cognitive modeling studies
4.1 View-based recognition of faces
4.1.1 Feature matching - the horizontal prior
4.1.2 Modeling psychophysical experiments
4.1.3 Summary
4.2 Configuration and components
4.2.1 The face representation
4.2.2 Feature matching
4.2.3 Modeling psychophysical experiments
4.2.4 Summary
4.3 Temporal aspects of recognition
4.3.1 Modeling temporal contiguity by learning keyframes
4.3.2 The influence of morphing on feature tracking
4.3.3 Learning keyframes from morphed or scrambled sequences
5 Computational studies I - Keyframes
5.1 Geometric constraints for local feature matching
5.1.1 Geometric constraints
5.1.2 Recognition under large view rotations
5.2 Keyframe extraction for learning of object representations
5.2.1 Parameters of keyframe extraction
5.2.2 Real-world sequences
5.2.3 Recognition using keyframes
5.3 Incremental build-up of object representations
5.3.1 Parameters of incremental learning
5.4 Conclusion
6 Computational studies II - SVMs and local features
6.1 Introduction
6.2 Support Vector Machines and local features
6.3 Local kernels
6.4 Experiments
6.4.1 Experimental Setup
6.4.2 Results: View Generalization
6.4.3 Experimental SIFT versus Local Kernels
6.4.4 Results: Recognition under Noise
6.4.5 Experimental using position constraints
6.5 Conclusion
7 Computational studies III - SVMs and keyframes
7.1 Algorithmic Overview
7.1.1 Image Sequence Processing
7.1.2 Feature Matching
7.1.3 Image and Feature Matching for Kernel Machines
7.1.4 Multi-class SVMs
7.2 Computational Experiments
7.2.1 Database and Representation
7.2.2 Classification of Images
7.2.3 Classification of Features
7.2.4 Experimental validation of positive definiteness
7.3 Conclusion
8 General conclusion and outlook
8.1 A unified framework for object recognition
8.1.1 Categorization processing by feature correspondences
8.1.2 The role of context in object recognition
8.1.3 Summary
8.2 Multi-modal keyframes
8.2.1 Psychophysics of visuo-haptic object recognition
8.2.2 Multi-modal keyframes - the view transition map
8.2.3 Computational experiments
8.3 Categorization using SVMs and local features
8.3.1 Experiment 1 - Categorization using a controlled database
8.3.2 Categorization experiments in cluttered scenes
8.3.3 Summary
Bibliography
Summary
In this thesis a computational framework for visual object recognition is developed, which is based
on results from perceptual research. The motivation for this approach is given by the fact that
despite several decades of research in the field of computer vision, there still exists no recognition
system which is able to match the vis