La lecture à portée de main
Découvre YouScribe en t'inscrivant gratuitement
Je m'inscrisDécouvre YouScribe en t'inscrivant gratuitement
Je m'inscrisDescription
Sujets
Informations
Publié par | rheinisch-westfalischen_technischen_hochschule_-rwth-_aachen |
Publié le | 01 janvier 2008 |
Nombre de lectures | 7 |
Langue | English |
Poids de l'ouvrage | 6 Mo |
Extrait
Efficient adaptive retrieval and mining
in large multimedia databases
Von der Fakult¨ at fur¨ Mathematik, Informatik und Naturwissenschaften
der Rheinisch-Westf¨ alischen Technischen Hochschule Aachen
zur Erlangung des akademischen Grades einer Doktorin der
Naturwissenschaften genehmigte Dissertation
vorgelegt von
Diplom-Informatikerin
Ira Assent
aus Aachen
Berichter: Universit¨ atsprofessor Dr. rer. nat. Thomas Seidl
Professor Dr. Christian Jensen
Tag der mundlic¨ hen Prufung:¨ 26. Februar 2008
Diese Dissertation ist auf den Internetseiten
der Hochschulbibliothek online verfugbar¨To my late parentsContents
Acknowledgements 1
Abstract 2
Zusammenfassung 4
Preliminaries 7
I Retrieving histogram data 17
1 Introduction 18
2 Histogram retrieval 20
2.1 Distance functions ........................2
2.2 The Earth Mover’s Distance ...................24
3 Multistep retrieval 28
3.1 Multistep filter-and-refine architectures .............28
3.2 Query processing .30
3.3 Filter properties..........................32
4 EMD lower bounds 35
4.1 3D-averaging lower bound ....................35ii CONTENTS
4.2 L -norm-based lower bounds . . .................36p
4.3 Independent minimization lower bound .............42
4.4 Multistep filter concept......................47
4.5 Experiments ............................50
4.6 Conclusion.............................56
5 Video stream retrieval 58
5.1 Similarity search and prioritizing ................60
5.2 System architecture........................61
5.3 The Query Processor .......................63
6 Exact indexing 66
6.1 Introduction ............................66
6.2 Related work ...........................68
6.3 Indexing of the Earth Mover’s Distance .............69
7 Vector approximation 75
7.1 Quantization and lookup table..................76
7.2 Lookup table lower bound ....................78
7.3 Experiments ............................84
7.4 Conclusion.............................91
8 Dimensionality reduction 92
8.1 Reduced Earth Mover’s Distance ................96
8.2 Query processing algorithm ...................98
8.3 Experiments ............................9
8.4 Conclusion103CONTENTS iii
II Retrieving time series data 104
9 Time series retrieval 105
9.1 Introduction ............................106
9.2 Related work ...........................108
9.3 Time series similarity search ...................112
10 The TS-tree 118
10.1 Query processing in TS-trees125
10.2 Experiments ............................138
10.3 Conclusion.............................151
III Mining histogram data 153
11 Subspace clustering 154
11.1 Introduction15
11.2 Related work ...........................157
12 EDSC 159
12.1 Depth-first multistep algorithm . . ...............163
12.2 Experiments ............................176
12.3 Conclusion.............................183
IV Mining time series data 184
13 Time series mining 185
13.1 Introduction ............................185
13.2 Related work ...........................18
13.3 Cluster model189iv CONTENTS
14 Efficient cluster mining 197
14.1 Monotonicity . . . ........................197
14.2 Indexing time series clusters ...................201
14.3 Clusters of arbitrary length202
14.4 Multicluster algorithm ......................203
14.5 Experiments ............................210
14.6 Conclusion.............................21
Conclusion 223
Bibliography 228List of Figures
1 Examples of multimedia data .................. 7
2 Similarity search .........................10
3 Range query and nearest neighbor query............10
4 Distance between histograms1
5 The knowledge discovery process ................13
6 Multistep query processing ...................15
Retrieving histogram data
2.1 Motivation of the Earth Mover’s Distance ...........23
2.2 Iso-lines of the Earth Mover’s Distance ............26
3.1 Minimum bounding rectangles geometrically .........29
3.2 Hierarchy of minimum bounding rectangles ..........30
3.3 Multistep retrieval .......................31
3.4 ICES filter criteria for multistep retrieval ...........3
4.1 Iso-contours of weighted L norms ...............37p
4.2 EMD flows between histograms ................38
4.3 Multistep concept for lower bounds of EMD..........48
4.4 Exp.: scalability, selectivity ...................49
4.5 Exp.: scalability, runtime ....................50vi LIST OF FIGURES
4.6 Exp.: dimensionality, selectivity ................51
4.7 Exp.:y, runtime..................52
4.8 Exp.: result size, selectivity ...................53
4.9 Exp.: result size, runtime ....................54
4.10 Exp.: multistep algorithm, selectivity .............5
4.11 Exp.: multistep runtime...............56
5.1 Concept of the AttentionAttractor59
5.2 Architecture of the AttentionAttractor.............61
5.3 Setup Dialog of the AttentionAttractor ............63
5.4 The demo system live and screenshot65
6.1 mindist for the Earth Mover’s Distance67
7.1 Quantization . . . ........................76
Algorithm 1: lookup table entry computation .................80
7.2 Mass distribution cases .....................81
7.3 Flows in minimization82
7.4 Multistep algorithm .......................84
7.5 Exp.: dimensionality, selectivity ................85
7.6 Exp.:y, number of EMD computations .....86
7.7 Exp.: dimensionality, runtime..................87
7.8 Exp.: scalability, runtime ....................8
7.9 Exp.:y, number of EMD computations .......89
7.10 Exp.: result size, runtime90
7.11 Exp.: dimensionality, runtime for two data sets ........91
8.1 Dimensionality reduction and lower bounding .........93
8.2y ....................94LIST OF FIGURES vii
8.3 Clustering on EMD cost matrix entries. ............97
8.4 Multistep setup for dimensionality reduction . . . ......99
8.5 Exp.: data ............................10
8.6 Exp.: reduced dimensionality, runtime .............101
8.7 Exp.: result size, runtime ....................101
8.8 Exp.: scalability, response time .................102
Retrieving time series data
9.1 R-tree indexing of time series . . . ...............109
9.2 TS-tree indexing for time series.................10
9.3 Time series data example ....................13
10.1 Separators ............................119
10.2 Quantized separators ......................120
10.3 Descriptor in TS-trees geometrically ..............123
10.4 Structure of TS-trees123
10.5 Mindist to meta data125
10.6 Mindist to separators126
10.7 Mindist to intersection of meta data and separators .....127
10.8 Lexicographic mindist computation ..............128
10.9 separator end...................130
10.10 Lexicographic mindist ................131
10.11 Euclidean Distance and Dynamic Time Warping .......13
10.12 Multistep query processing135
10.13 Optimal multistep query processing for time series . .....136
10.14 Exp.: capacity ..........................141
10.15 Exp.: compactness per level...................142