Tutorial on the Semantic Web
Ken Baclawski
Northeastern University
Versatile Information Systems
Outline
I. Ontology Languages
A. From flat files to hierarchies and XML
B. Rule based systems
C. Resource Description Framework
D. Web Ontology Language
II. Ontology Applications
A. Ontology based information retrieval
B. Transformation languages and tools
C. Bayesian Web: Combining logic and probability
D. Situation Awareness
III. Ontology Design
Flat File Records
Consider the following records in a flat file:
011500 18.66 0 0 62 46.271020111 25.220010
01 26.93 0 1 63 68.951521001 32.651010
020100 33.95 1 0 65 92.532041101 18.930110
02 17.38 0 0 67 50.351111100 42.160001
What do they mean?
Metadata
The explanation of what data means is called
metadata or “data about data.”
For a flat file or database the metadata is
called the schema.
NAME     LENGTH  FORMAT      LABEL
instudy  6       MMDDYY      Date of randomization into study
bmi      8       Num         Body Mass Index
obesity  3       0=No 1=Yes  Obesity (30.0 <= BMI)
ovrwt    8       0=No 1=Yes  Overweight (25 <= BMI < 30)
Height   3       Num         Height (inches)
Wtkgs    8       Num         Weight (kilograms)
Weight   3       Num         Weight (pounds)
Record Structures
A flat file is a collection of records.
A record consists of fields.
Each record in a flat file has the same number
and kinds of fields as any other record in the
same file.
The schema of a flat file describes the
structure (i.e., the kinds of fields) of each
record.
A schema is an example of an ontology.
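As a minimal sketch of how a schema gives meaning to a record, the Python below pairs each field name with a converter and applies them to the leading fields of the first record. The `SCHEMA` list, the `parse_record` helper, and the assumption that the leading whitespace-separated tokens line up with the first five schema rows are illustrative choices, not part of the original data set's documentation.

```python
from datetime import datetime

# Hypothetical schema: (field name, converter) pairs in record order.
# Only the first five fields of the table are modeled here.
SCHEMA = [
    ("instudy", lambda s: datetime.strptime(s, "%m%d%y").date()),  # MMDDYY
    ("bmi", float),
    ("obesity", lambda s: s == "1"),  # 0=No 1=Yes
    ("ovrwt", lambda s: s == "1"),    # 0=No 1=Yes
    ("Height", int),                  # inches
]

def parse_record(line):
    """Split a record on whitespace and convert each token per the schema."""
    tokens = line.split()
    return {name: convert(tok) for (name, convert), tok in zip(SCHEMA, tokens)}

record = parse_record("011500 18.66 0 0 62")
print(record)
```

Without the schema, the same line is just five opaque tokens; with it, every value has a name, a type, and a unit.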
Self-Describing Data
<Interview RandomizationDate="2000-01-15" BMI="18.66" Height="62"... />
<Interview RandomizationDate="2000-01-15" BMI="26.93" Height="63"... />
<Interview RandomizationDate="2000-02-01" BMI="33.95" Height="65"... />
<Interview RandomizationDate="2000-02-01" BMI="17.38" Height="67"... />
<!ATTLIST Interview
RandomizationDate CDATA #REQUIRED
BMI CDATA #IMPLIED
Height CDATA #REQUIRED
>
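Because the attribute names travel with the values, a program can read such a record without consulting an external schema. The sketch below uses Python's standard `xml.etree.ElementTree` module on one Interview element; the elided attributes (the `...` in the slide) are simply left out, and treating a missing BMI as `None` is an assumption that mirrors its `#IMPLIED` (optional) declaration.

```python
import xml.etree.ElementTree as ET

# One Interview element from the slide, with the elided attributes omitted.
interview = ET.fromstring(
    '<Interview RandomizationDate="2000-01-15" BMI="18.66" Height="62"/>'
)

# Every value arrives labeled with its own field name.
for name, value in interview.attrib.items():
    print(name, "=", value)

# BMI is declared #IMPLIED, so it may be absent; get() then returns None.
bmi_text = interview.get("BMI")
bmi = float(bmi_text) if bmi_text is not None else None
```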
The eXtensible Markup Language
XML is a format for representing data.
XML goes beyond flat files by allowing
elements to contain other elements, forming
a hierarchy.
XML        Flat Files
Element    Record
Attribute  Field
DTD        Schema
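The correspondence in the table can be sketched in a few lines of Python: a record becomes an element and each field becomes an attribute. The `record_to_element` helper is a hypothetical name introduced for illustration, and the field values are taken from the first Interview record.

```python
import xml.etree.ElementTree as ET

def record_to_element(record_name, fields):
    """Map a flat record to an XML element: one attribute per field."""
    # ElementTree expects attribute values to be strings.
    return ET.Element(record_name, {name: str(value) for name, value in fields.items()})

elem = record_to_element(
    "Interview",
    {"RandomizationDate": "2000-01-15", "BMI": 18.66, "Height": 62},
)
xml_text = ET.tostring(elem, encoding="unicode")
print(xml_text)
```

The third row of the table (DTD as schema) has no counterpart in this sketch; the standard library module used here does not validate against a DTD.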
<bioml>
<organism name="Homo sapiens (human)">
<chromosome name="Chromosome 11" number="11">
<locus name="HUMINS locus">
<reference name="Sequence databases">
<db_entry name="Genbank sequence" entry="v00565" format="GENBANK"/>
<db_entry name="EMBL sequence" format="EMBL" entry="V00565"/>
</reference>
<gene name="Insulin gene">
<dna name="Complete HUMINS sequence" start="1" end="4992">
1 ctcgaggggc ctagacattg ccctccagag agagcaccca acaccctcca ggcttgaccg
...
</dna>
<ddomain name="flanking domain" start="1" end="2185"/>
<ddomain name="polymorphic domain" start="1340" end="1823"/>
<ddomain name="Signal peptide" start="2424" end="2495"/>
...
<exon name="Exon 1" start="2186" end="2227"/>
<intron name="Intron 1" start="2228" end="2406"/>
. . .
</gene>
</locus>
</chromosome>
</organism>
</bioml>
[Slide callouts: Element; XML Hierarchy; Element Hierarchy]
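To make the nesting concrete, the following sketch walks a trimmed copy of the BioML example and records each element with its depth. The `FRAGMENT` string keeps only a few elements from the slide, and the `outline` function is a hypothetical helper written for this illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed fragment of the BioML example; most elements are omitted.
FRAGMENT = """
<organism name="Homo sapiens (human)">
  <chromosome name="Chromosome 11" number="11">
    <locus name="HUMINS locus">
      <gene name="Insulin gene">
        <exon name="Exon 1" start="2186" end="2227"/>
        <intron name="Intron 1" start="2228" end="2406"/>
      </gene>
    </locus>
  </chromosome>
</organism>
"""

def outline(elem, depth=0):
    """Return (depth, tag, name) for elem and all descendants, in document order."""
    rows = [(depth, elem.tag, elem.get("name"))]
    for child in elem:
        rows.extend(outline(child, depth + 1))
    return rows

rows = outline(ET.fromstring(FRAGMENT))
for depth, tag, name in rows:
    print("  " * depth + f"{tag}: {name}")
```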
Specifying XML Hierarchies
A DTD can specify the kinds of elements that
can be contained in an element.
<!ELEMENT locus (reference|gene)*>
<!ELEMENT reference (db_entry)*>
<!ELEMENT gene (dna,ddomain*,(exon|intron)*)>
A locus element can contain any number of reference and gene elements, in any order.
A reference element can contain any number of db_entry elements.
A gene element must contain a dna element, followed by any number of
ddomain elements, followed by any number of exon and intron elements in any order.
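A DTD content model is essentially a regular expression over child element names. The sketch below illustrates that idea by hand-translating the three declarations into Python regular expressions and matching them against the sequence of child tags. This is not a DTD validator: the `CONTENT_MODELS` table and the `children_match` helper are assumptions made for illustration, and text content and attributes are ignored.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated content models; each child tag is followed by one space.
CONTENT_MODELS = {
    "locus": r"((reference|gene) )*",              # (reference|gene)*
    "reference": r"(db_entry )*",                  # (db_entry)*
    "gene": r"dna (ddomain )*((exon|intron) )*",   # (dna,ddomain*,(exon|intron)*)
}

def children_match(elem):
    """Check that elem's child tags, in order, fit its content model."""
    sequence = "".join(child.tag + " " for child in elem)
    return re.fullmatch(CONTENT_MODELS[elem.tag], sequence) is not None

good = ET.fromstring("<gene><dna/><ddomain/><exon/><intron/></gene>")
bad = ET.fromstring("<gene><exon/><dna/></gene>")  # dna must come first
print(children_match(good), children_match(bad))
```

A gene element with no dna child fails the check, while an empty locus element passes, which matches the "must contain" and "any number of" wording above.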