A logic-based approach to multimedia interpretation [electronic resource] / by Atila Kaya
206 pages
English


Description

A Logic-Based Approach to Multimedia Interpretation

Dissertation approved by the doctoral committee of the Technische Universität Hamburg-Harburg for the award of the academic degree Doktor der Naturwissenschaften (Dr. rer. nat.), by Atila Kaya from Izmir, Turkey, 2011.

Reviewers: Prof. Dr. Ralf Möller, Prof. Dr. Bernd Neumann, Prof. Dr. Rolf-Rainer Grigat
Day of the defense: 28.02.2011

Abstract

The availability of metadata about the semantics of information in multimedia documents is crucial for building semantic applications that offer convenient access to relevant information and services. In this work, we present a novel approach for the automatic generation of rich semantic metadata based on surface-level information. For the extraction of the required surface-level information, state-of-the-art analysis tools are used. The approach exploits a logic-based formalism as the foundation for knowledge representation and reasoning. To develop a declarative approach, we formalize a multimedia interpretation algorithm that exploits formal inference services offered by a state-of-the-art reasoning engine. Furthermore, we present the semantic interpretation engine, a software system that implements the logic-based multimedia interpretation approach, and test it through experimental studies. We use the results of our tests to evaluate the fitness of our logic-based approach in practice. Finally, we conclude this work by highlighting promising areas for future work.

Subjects

Information

Published by
Published on 1 January 2010
Number of reads: 25
Language: English
File size: 3 MB

Excerpt

A LogicBased Approach to MultimediaInterpretation
Vom Promotionsausschuss der Technischen Universität HamburgHarburg zur Erlangung des akademischen Grades Doktor der Naturwissenschaften (Dr. rer. nat.) genehmigte Dissertation
von
Atila Kaya
aus Izmir, Türkei
2011
Reviewers: Prof. Dr. Ralf Möller Prof. Dr. Bernd Neumann Prof. Dr. RolfRainer Grigat
Day of the defense: 28.02.2011
Abstract
The availability of metadata about the semantics of information in mul timedia documents is crucial for building semantic applications that offer convenient access to relevant information and services. In this work, we present a novel approach for the automatic generation of rich semantic metadata based on surfacelevel information. For the extraction of the required surfacelevel information stateoftheart analysis tools are used. The approach exploits a logicbased formalism as the foundation for knowl edge representation and reasoning. To develop a declarative approach, we formalize a multimedia interpretation algorithm that exploits formal infer ence services offered by a stateoftheart reasoning engine. Furthermore, we present the semantic interpretation engine, a software system that im plements the logicbased multimedia interpretation approach, and test it through experimental studies. We use the results of our tests to evaluate the fitness of our logicbased approach in practice. Finally, we conclude this work by highlighting promising areas for future work.
To my dear parents and wife Sevgili anneme, babama ve esime .
Acknowledgements
This thesis is the result of five years' work in the Institute for Software Systems (STS) research group at the Hamburg University of Technology (TUHH). I am grateful to my advisor Prof. Dr. Ralf Möller for giving me the opportunity to conduct such exciting research and for mentoring me. I would also like to thank Prof. Dr. Bernd Neumann and Prof. Dr. Rolf-Rainer Grigat for reviewing this work.
I would like to express my gratitude to all my colleagues at the STS research group: Sofia Espinosa, Sylvia Melzer, Alissa Kaplunova, Tobias Näth, Kamil Sokolski, Maurice Rosenfeld, Oliver Gries, Anahita Nafissi, Dr. Hans-Werner Sehring, Olaf Bauer, Rainer Marrone, Sebastian Wandelt, Volker Menrad and Gustav Munkby. Special thanks go to Dr. Patrick Hupe and Dr. Michael Wessel, who always supported and encouraged me.
I am also indebted to the STS staff members Hartmut Gau, Ulrike Hantschmann, Thomas Rahmlow and Thomas Sidow for their excellent administrative and technical support.
Finally, I would like to thank my parents Tükez and Dursun, and my wife Justyna for their love, care and continuous support.
Contents

List of Figures

1 Introduction
  1.1 Motivation for this Research
  1.2 Research Objectives
  1.3 Contributions
  1.4 Dissemination Activities
  1.5 Outline of the Dissertation

2 Logical Formalization of Multimedia Interpretation
  2.1 Applications and Related Research Fields
  2.2 Related Work on Image Interpretation
    2.2.1 Image Interpretation Based on Model Generation
    2.2.2 Image Interpretation Based on Abduction
    2.2.3 Image Interpretation Based on Deduction
  2.3 Discussion

3 Logical Engineering of a Multimedia Interpretation System
  3.1 Knowledge Representation Formalisms
    3.1.1 Introduction to Description Logics
    3.1.2 Introduction to Logic Programming
  3.2 Overview of a Multimedia Interpretation System
  3.3 Formalizing ABox Abduction
    3.3.1 Related Work on Abduction
    3.3.2 The ABox Abduction Algorithm
    3.3.3 Selecting Preferred Explanations
  3.4 Abduction-Based Interpretation
  3.5 Fusion of Modality-Specific Interpretations

4 Case Studies
  4.1 The BOEMIE Project
  4.2 The Semantic Interpretation Engine
  4.3 Interpretation of a Sample Multimedia Document
    4.3.1 Modality-Specific Interpretations
    4.3.2 Strategies for the Interpretation Process
    4.3.3 Fusion

5 Evaluation
  5.1 Performance and Scalability
  5.2 Quality of Interpretation Results

6 Conclusions
  6.1 Summary
  6.2 Outlook

References

Index
List of Figures

3.1 The hybrid approach for obtaining deep semantic annotations
3.2 Interpretation of complex concept descriptions
3.3 A graphical representation of the concept definition Person, which requires modeling of a triangular structure
3.4 A graphical representation of an ABox with an inferred role assertion (dashed) caused by the transitive role R
3.5 An example UML class diagram
3.6 An example TBox T
3.7 The multimedia interpretation process. Input: analysis ABox. Output: interpretation ABox(es). Background knowledge: domain ontology and interpretation rules
3.8 Interpretation of a document consisting of observations and their explanations
3.9 The multimedia interpretation approach including processing steps for analysis, interpretation and fusion
3.10 A rule used by the Wimp3 system for network construction
3.11 The Bayesian network constructed for plan recognition

4.1 The architecture of the semantic interpretation engine, which is deployed into the Apache Tomcat servlet container. Apache Axis is a core engine for web services. The semantic interpretation engine exploits the inference services offered by RacerPro. Each RacerPro instance is dedicated to a single modality.
4.2 A sample web page with athletics news
4.3 The image taken from the sample web page in Figure 4.2
4.4 The ABox imageABox01 representing the results of image analysis for the image in Figure 4.3
4.5 An excerpt of the TBox T for the athletics domain
4.6 An excerpt of the image interpretation rules R_ima for the athletics domain
4.7 The ABox A after the addition of Δ1
4.8 The interpretation ABoxes imageABox01 interpretation1 and imageABox01 interpretation2 returned by the semantic interpretation engine
4.9 The caption of the image shown in Figure 4.3
4.10 The ABox captionABox01 representing the results of text analysis for the caption in Figure 4.9
4.11 Another excerpt of the TBox T for the athletics domain
4.12 An excerpt of the caption interpretation rules R_cap for the athletics domain
4.13 The interpretation ABox captionABox01 interpretation1 returned by the semantic interpretation engine
4.14 The first paragraph of the text segment of the sample web page
4.15 The ABox textABox01 representing the results of text analysis for the text segment in Figure 4.14
4.16 Another excerpt of the TBox T for the athletics domain
4.17 An excerpt of the text interpretation rules R_tex for the athletics domain
4.18 The ABox A after the addition of the explanation Δ2
4.19 The interpretation ABox textABox01 interpretation1 returned by the semantic interpretation engine
4.20 The ABox sampleABox1
4.21 A sample TBox T
4.22 A set of text interpretation rules R1
4.23 Two possible interpretation results for the same analysis ABox sampleABox1, where the one on the left-hand side is preferred
4.24 The ABox sampleABox2
4.25 A set of text interpretation rules R2 containing a single rule
4.26 Two different interpretation results for the analysis ABox sampleABox2, where the one on the left-hand side is preferred
4.27 The sample analysis ABox sampleABox3
4.28 A set of text interpretation rules R3
4.29 Two different interpretation results for the analysis ABox sampleABox3, where the one on the left-hand side is preferred
4.30 An excerpt of the axioms, which are added to the background knowledge T
4.31 All assertions of the interpretation ABox captionABox01 interpretation1 as returned by the semantic interpretation engine
4.32 The analysis ABox of a sample web page
4.33 A sample image interpretation ABox
4.34 A sample caption interpretation ABox
4.35 The fused interpretation ABox of the sample web page

5.1 The number of fiat assertions (x) and the time (y) spent in minutes for the interpretation of 500 text analysis ABoxes
5.2 The number of fiat assertions (x) and the time (y) spent in minutes for the interpretation of selected text analysis ABoxes
5.3 The sum of fiat and bona fide assertions (x) and the time (y) spent in minutes for the interpretation of 500 text analysis ABoxes
5.4 The number of fiat and bona fide assertions (x) and the time (y) spent in minutes for the interpretation of selected text analysis ABoxes