Hyperstructure-based search methods for the World Wide Web [Electronic resource] / by Zhanzi Qiu
188 pages
English


Published on 1 January 2004

Hyperstructure-Based Search Methods for the
World Wide Web

Dissertation

approved by the Department of Computer Science
of the Technische Universität Darmstadt

for the award of the academic degree of
Doktor-Ingenieur (Dr.-Ing.)


by
Zhanzi Qiu
(Master of Science)
born in Fujian, China


Referee: Prof. Dr. Erich J. Neuhold
Co-referee: Prof. Dr. James Geller
Date of submission: 19.01.2004
Date of oral examination: 22.03.2004


Darmstadt 2004
D17
Darmstadt Dissertation








Abstract
Keywords: hyperstructure, search methods, World Wide Web, XML/RDF

This thesis presents several hyperstructure-based Web search methods and a
prototype system that is designed to implement the methods. Given the context of
hyperlink structural and semantic information that is representable with new Web
standards, this thesis is an effort to answer the open question of how to efficiently
make use of such information for searching the Web and filtering and retrieving
relevant information.
The hyperstructure-based approach taken in this thesis extends the
traditional structure-based search method, which mainly handles hierarchical
structures (formed by non-linking mechanisms) in structured documents (e.g.,
XML). In addition to such hierarchical structures, this approach also handles both
hierarchical and non-hierarchical structures formed by linking mechanisms.
Compared to other link-based approaches that largely take into account the quantity of
links in their search methods, this approach also makes use of the semantic
information in links and link-based structures. It is in line with the trend of Web
development with regard to capturing rich structural and semantic information and
thereby capitalizing on the potential of new search methods.
The hyperstructure-based search methods presented in this thesis can be applied to
improve search quality on the Web as the Web evolves from a poorly structured to
a more structured, semantically rich network. More concretely, by making use of
hypertext composites and contexts, the search results can be more specific with
respect to users’ information needs, and additionally, the users’ efforts to interpret the
search results can be reduced. Presenting structured search results based on hypertext
composites as inter-linked nodes/pages rather than separate nodes/pages helps users
understand the retrieved information better. By making use of semantic information in
hyperstructures (e.g., types of links and nodes), better filters can be developed for
selecting and ranking the Web pages retrieved by search systems. These pages can be
either intermediate information for further processing or final search results presented
to users. By making use of domain models, domain-specific structure-based search
methods can be developed, which may generate better results than general search
methods that do not understand the domain-specific information.

Kurzfassung (Summary)
Keywords: hyperstructure, search methods, World Wide Web, XML/RDF

Several new Internet standards, above all XML and RDF, promise to improve
access to information on the Internet. So far, however, it has been unclear how
the new structures and semantic information that can be expressed with these
standards can best be used for information search. This thesis provides an
answer. It presents four different hyperstructure-based search methods and a
prototype search system. Some experiments were also carried out. The results
show that the new search methods make the following possible:
• Novel form-based queries can be posed.
• Search results can be viewed in their original context (i.e., within a
document or a group of documents). This lets users judge relevance better.
• Better filters for selecting and ranking by relevance can be developed and
applied before the retrieved information is processed and presented to the
user.
• Domain-specific search methods can be developed that deliver better
results than general search methods, since the latter do not "understand"
domain-specific information.

Acknowledgements
I would like to express my deep gratitude to my supervisor, Prof. Dr. Erich J.
Neuhold, for his advice, support, and encouragement throughout my doctoral
programme. I would like to thank my second advisor, Prof. Dr. James Geller, for his
comments and suggestions on the thesis.
A number of other people deserve special acknowledgement. I thank Dr. Matthias
Hemmje for his advice and guidance throughout the completion of this thesis. I
thank Dr. Weigang Wang, Dr. Ulrich Thiel and Dr. Reginald Ferber for their valuable
advice, which helped me clarify my thesis topic. Also, I thank Prof. Dr. Klaus
Mätzel for his advice and support during the period I worked in his division.
This thesis was done with the financial and technical support of GMD-IPSI, now
FhG-IPSI. The kind help of colleagues in the institute, especially in the
divisions DELITE, TOPAS and the institute office, is appreciated. The support of
the Computer Science Department, Darmstadt University of Technology, deserves
special thanks.
Finally, I would like to express my sincere appreciation to my family for their
support and understanding in helping me to complete this thesis. This thesis is
especially dedicated to my dad, Ruizheng Qiu, who is always with me in my heart.

Table of Contents
Abstract
Kurzfassung
Acknowledgements
1 Introduction
1.1 The Problem
1.2 Existing Search Methods
1.3 My Approach
1.4 Research Methods
1.5 Innovations and Limitations
1.6 Definitions of Terms
1.7 Structure of the Document
2 Background and Related Work
2.1 Application Areas
2.1.1 Digital Libraries
2.1.2 Knowledge Management
2.2 Related Basic Research Fields
2.2.1 Information Retrieval Issues
2.2.1.1 An Information Retrieval System
2.2.1.2 Measures of Retrieval Effectiveness
2.2.2 Hypertext Issues
2.2.2.1 Hypertext
2.2.2.2 Hypertext Challenges
2.2.2.3 Dexter Hypertext Reference Model
2.2.2.4 Semantic Net and Hypertext
2.2.3 WWW Issues
2.2.3.1 Limitations of the Traditional Web
2.2.3.2 Document Representation – HTML and/or XML
2.2.3.3 Expressing Semantics – RDF & RDF Schemas
