OAI4-tutorial-simeon-6up

OAI-PMH repositories: Quality issues regarding metadata and protocol compliance
Tim Cole (University of Illinois at UC) & Simeon Warner (Cornell University)
OAI4 @ CERN, Geneva, 20 October 2005

Schedule
9:00 I. Introduction (who we are / scope / objectives / intended outcomes)
9:10 II. Brief review of OAI-PMH concepts & terminology (Simeon)
• Quick refresher on protocol basics
9:30 III. Validation and compliance of an OAI data provider (Simeon)
• Common problems / What to watch out for
• Validation services
• Questions/discussion
10:15 Break
10:30 IV. Disseminating shareable metadata (Tim)
• What makes for good, shareable metadata
• Considering service provider expectations
• Specific recommended best practices
• Questions/discussion
11:15 V. Concluding remarks and wrap-up questions & answers (Tim)
• Including a review of essential resources, software, tools
11:30 Close

Who you are
• 13/24 responses by 2005-10-18T18:00:00Z
• 70% implementing data-provider (45% of those writing one; overall languages: php, python, java, perl)
• 70% have experience in metadata creation (of those 100% dc/qdc, 55% other including MARC flavors, METS, MODS, MAB, LOM). Most plan only to use dc in OAI, why?
• 40% have harvesting experience (15% lots)
• 84% XML, XSLT and/or W3C Schema experience (varying some to lots)

What you want ... & what you’ll get
• General OAI SP and DP information: Simeon - Introduction
• Best practices for OAI identifiers: Simeon - Protocol & resource
• Best practices for repository implementation, pitfalls, automatic harvesting: Simeon
• Use of XML: Simeon - Schemas/encoding
• General info on metadata formats (oai_dc, MARC, METS): Tim
• Dealing with granularity in IR software packages: Policy issues + Tim re metadata
• Metadata practices and future trends: Tim - Best practices initiative
• HTML tags in metadata?: Tim - In general troublesome
• Hiding records vs expressing rights: Tim - Metadata alone no good
• New developments in metadata standards: Tim - Anything with W3C XML schema ... supported by OAI (esp. DC)
• Realistic workflow for quality/compliance: Ideas but more advanced
• How to improve repository: Simeon/Tim - Overall
• RDF metadata: Tim - Need XML schema

OAI-PMH: A whistle-stop tour
• Just 20 minutes (19 now) so I’ll be brief...
– I’m happy to answer any specific question though
• Only talking about v2.0, not 1.x (pre 2002)
• Reference: http://www.openarchives.org/OAI/2.0/openarchivesprotocol.htm
• Help: oai-implementers list

Service-provider / Data-provider
[Diagram: a harvester (service-provider) sends OAI-PMH selective harvesting requests to a repository (data-provider); the repository returns OAI-PMH records.]
• Selective harvesting requests select by datestamp and by set
• Service-provider (harvester): provides services using harvested metadata
• Data-provider (repository): exposes metadata pertaining to resources
OAI-PMH provides a way for a service-provider to efficiently keep an up-to-date copy of (some of) the metadata exposed by a data-provider. Services can then be built on top of this metadata.

Data model: resource-item-record
[Diagram: resource -> item -> records]
• resource: David, in the slide's example
• item: all available metadata about David
• records: Dublin Core metadata, MARC metadata, DIDL
• set-membership is an item-level property
item <=> identifier
record <=> identifier + metadataPrefix + datestamp

Records and identifiers
• In OAI-PMH a record is uniquely identified within a repository by identifier + metadataPrefix + datestamp
• identifier here is NOT the identifier of the resource
– resource identifier goes in metadata record (Tim)
– pick appropriate scheme to make globally unique (e.g. oai-identifier, info:)
• metadataPrefix codes for a namespace, only oai_dc can be assumed to tie globally
• datestamp is UTC time of last update in repository’s granularity (globally meaningful)
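
The identity rule above (record <=> identifier + metadataPrefix + datestamp) can be sketched in a few lines of Python. This is only an illustration: the `RecordKey` name, the `marc21` prefix and the later datestamp are assumptions made up for the example, not values from the tutorial.

```python
from typing import NamedTuple


class RecordKey(NamedTuple):
    """Hypothetical key for one record within a single repository."""
    identifier: str        # item identifier, e.g. an oai-identifier
    metadata_prefix: str   # e.g. "oai_dc"
    datestamp: str         # UTC datestamp in the repository's granularity


# One item (one identifier) exposed in two metadata formats gives two records.
dc = RecordKey("oai:arXiv:cs/0112017", "oai_dc", "2001-12-14")
marc = RecordKey("oai:arXiv:cs/0112017", "marc21", "2001-12-14")  # assumed prefix

# The same format after a later update is again a different record.
dc_updated = dc._replace(datestamp="2002-01-10")  # illustrative date

same_item = dc.identifier == marc.identifier == dc_updated.identifier
distinct_records = len({dc, marc, dc_updated})
```

All three keys point at the same item, yet they count as three distinct records, which is why a harvester that stores records should key them on all three parts.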

oai-identifier
• revision of oai-identifier from v1.x
• separate guidelines, both still used with OAI-PMH v2.0
• any new use of oai-identifier should use v2.0
<description>
 <oai-identifier xmlns="http://www.openarchives.org/OAI/2.0/oai-identifier"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai-identifier
     http://www.openarchives.org/OAI/2.0/oai-identifier.xsd">
  <scheme>oai</scheme>
  <repositoryIdentifier>oai-stuff.foo.org</repositoryIdentifier>
  <delimiter>:</delimiter>
  <sampleIdentifier>oai:oai-stuff.foo.org:5324</sampleIdentifier>
 </oai-identifier>
</description>
Note: repositoryIdentifier gives domain based repository identifiers

Six verbs
Verb: Function
Metadata about the repository
• Identify: description of repository
• ListMetadataFormats: metadata formats supported by repository
• ListSets: sets defined by repository
Harvesting verbs
• ListIdentifiers: OAI unique ids contained in repository
• ListRecords: listing of N records
• GetRecord: listing of a single record
Most verbs take arguments: datestamps, sets, id, metadata format and resumption token (for flow control)
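
As a rough sketch of how a verb and its arguments become a request, the helper below builds a GET URL with Python's standard `urllib.parse.urlencode`. The function name `build_request_url` and the trailing-underscore convention for `from_` are assumptions of this example; the base URL and set name are the ones used in the tutorial's arXiv samples.

```python
from urllib.parse import urlencode

VERBS = {"Identify", "ListMetadataFormats", "ListSets",
         "ListIdentifiers", "ListRecords", "GetRecord"}


def build_request_url(base_url, verb, **arguments):
    """Return a GET URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in VERBS:
        raise ValueError(f"{verb} is not a valid OAI-PMH verb")
    params = {"verb": verb}
    for name, value in arguments.items():
        if value is not None:
            # "from" is a Python keyword, so callers pass it as from_.
            params[name.rstrip("_")] = value
    return f"{base_url}?{urlencode(params)}"


url = build_request_url("http://arXiv.org/oai2", "ListRecords",
                        metadataPrefix="oai_dc", from_="2001-12-14", set="cs")
```

Rejecting an unknown verb locally mirrors what a data-provider would report as badVerb, but a real harvester would still need to handle that error coming back from the server.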

Identify
• Arguments
– none
• Errors
– badArgument - if any argument is given
“Tell me about yourself..”

ListMetadataFormats
• Arguments
– identifier (OPTIONAL)
• Errors
– badArgument - extra or unparsable arguments
– noMetadataFormats - instead of empty reply
– idDoesNotExist - more specific than just badArgument
“What metadata formats do you support? What internal names correspond to namespaces?”

ListSets
• Arguments
– resumptionToken (EXCLUSIVE)
• Errors
– badArgument
– badResumptionToken
– noSetHierarchy
“What sets are items organized in, if any? How are they identified and described?”

ListIdentifiers
• Arguments
– from (OPTIONAL)
– until (OPTIONAL)
– set (OPTIONAL)
– resumptionToken (EXCLUSIVE)
– metadataPrefix (REQUIRED)
• Errors
– badArgument
– cannotDisseminateFormat
– badResumptionToken
– noSetHierarchy
– noRecordsMatch
“What records are available in this set/date-range/metadata format?”
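
The EXCLUSIVE marking on resumptionToken means that a follow-up request carries the token and the verb, and nothing else. A minimal sketch of that flow-control loop follows; `fetch_page` is a hypothetical callable standing in for the HTTP request and XML parsing, and the second identifier and the token value are made up for the demonstration.

```python
def harvest(fetch_page, verb, **first_arguments):
    """Yield every item of a list request, following resumption tokens.

    fetch_page(arguments) is a hypothetical callable that performs one
    request and returns (items, resumption_token); a falsy token means
    the list is complete.
    """
    arguments = {"verb": verb, **first_arguments}
    while True:
        items, token = fetch_page(arguments)
        yield from items
        if not token:
            return
        # resumptionToken is EXCLUSIVE: send it with the verb and nothing else.
        arguments = {"verb": verb, "resumptionToken": token}


# Stand-in for a data-provider that splits its answer into two chunks.
seen_requests = []


def fake_fetch(arguments):
    seen_requests.append(dict(arguments))
    if "resumptionToken" in arguments:
        return ["oai:arXiv:hep-th/9801002"], None
    return ["oai:arXiv:hep-th/9801001"], "token-1"


identifiers = list(harvest(fake_fetch, "ListIdentifiers",
                           metadataPrefix="oai_dc", set="cs"))
```

Passing the fetch step in as a function keeps the loop testable without a network connection; a real harvester would also want retry and error handling around each request.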

ListRecords
• Arguments
– from (OPTIONAL)
– until (OPTIONAL)
– set (OPTIONAL)
– resumptionToken (EXCLUSIVE)
– metadataPrefix (REQUIRED)
• Errors
– noRecordsMatch
– cannotDisseminateFormat
– badResumptionToken
– noSetHierarchy
– badArgument
“Give me all the records available in this set/date-range/metadata format”

GetRecord
• Arguments
– identifier (REQUIRED)
– metadataPrefix (REQUIRED)
• Errors
– badArgument
– cannotDisseminateFormat
– idDoesNotExist
“Give me this specific record from the given item in the requested format”
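
The argument lists for the six verbs lend themselves to a small lookup table. The sketch below shows one way a data-provider might check a request before doing any work; the `ARGUMENTS` table and `check_arguments` function are hypothetical names, and the check only covers argument names, not their values.

```python
# Arguments per verb as listed on the slides: (required, optional, exclusive).
ARGUMENTS = {
    "Identify": (set(), set(), None),
    "ListMetadataFormats": (set(), {"identifier"}, None),
    "ListSets": (set(), set(), "resumptionToken"),
    "ListIdentifiers": ({"metadataPrefix"}, {"from", "until", "set"},
                        "resumptionToken"),
    "ListRecords": ({"metadataPrefix"}, {"from", "until", "set"},
                    "resumptionToken"),
    "GetRecord": ({"identifier", "metadataPrefix"}, set(), None),
}


def check_arguments(verb, arguments):
    """Return an OAI-PMH error code for a bad request, or None if acceptable."""
    if verb not in ARGUMENTS:
        return "badVerb"
    required, optional, exclusive = ARGUMENTS[verb]
    names = set(arguments)
    if exclusive in names:
        # An exclusive argument must be the only argument besides the verb.
        return None if names == {exclusive} else "badArgument"
    if not required <= names:
        return "badArgument"      # a REQUIRED argument is missing
    if names - required - optional:
        return "badArgument"      # an unknown argument was supplied
    return None
```

For example, `check_arguments("ListRecords", {"resumptionToken": "t", "set": "cs"})` yields `"badArgument"` because the token was combined with another argument.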

Protocol vs periphery
• Protocol
– Protocol document
– Extension schemas
• Periphery
– HTTP
– XML
– oai_dc
– Community guidelines

OAI-PMH vs HTTP
• clear separation of OAI-PMH and HTTP
– OAI-PMH error handling
• all OK at HTTP level? => 200 OK
• something wrong at OAI-PMH level? => OAI-PMH error (e.g. badVerb)
– HTTP codes 302, 503, etc. still available to implementers, but they don’t represent OAI-PMH events
• (except perhaps in baseURL terminology)

Response with no errors
<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH>
 <responseDate>2002-02-08T08:55:46Z</responseDate>
 <request verb="GetRecord"… …>http://arXiv.org/oai2</request>
 <GetRecord>
  <record>
   <header>
    <identifier>oai:arXiv:cs/0112017</identifier>
    <datestamp>2001-12-14</datestamp>
    <setSpec>cs</setSpec>
    <setSpec>math</setSpec>
   </header>
   <metadata>
    …..
   </metadata>
  </record>
 </GetRecord>
</OAI-PMH>
Note no HTTP encoding of the OAI-PMH request

Response with error
With errors, only the correct attributes are echoed in <request>
<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH>
 <responseDate>2002-02-08T08:55:46Z</responseDate>
 <request>http://arXiv.org/oai2</request>
 <error code="badVerb">ShowMe is not a valid OAI-PMH verb</error>
</OAI-PMH>
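
Because an OAI-PMH error still arrives with HTTP 200, a harvester has to look inside the XML body for `<error>` elements. The sketch below does this with Python's standard `xml.etree.ElementTree`. Note one assumption: the slide samples omit the namespace declaration for brevity, whereas this sample declares the OAI-PMH 2.0 namespace on the root element as real responses do. The `oai_errors` name is made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 responses, in ElementTree's {uri}tag notation.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2002-02-08T08:55:46Z</responseDate>
  <request>http://arXiv.org/oai2</request>
  <error code="badVerb">ShowMe is not a valid OAI-PMH verb</error>
</OAI-PMH>"""


def oai_errors(body):
    """Return [(code, message), ...] for every <error> in a response body."""
    root = ET.fromstring(body)
    return [(e.get("code"), (e.text or "").strip())
            for e in root.findall(f"{OAI_NS}error")]


# The body is passed as bytes, as it would arrive over HTTP.
errors = oai_errors(SAMPLE.encode("utf-8"))
```

An empty list means the response carried no protocol-level error; HTTP-level problems (such as 503) would have to be handled separately, before parsing.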

Set membership in header
The header contains the set membership of item
<record>
 <header>
  <identifier>oai:arXiv:cs/0112017</identifier>
  <datestamp>2001-12-14</datestamp>
  <setSpec>cs</setSpec>
  <setSpec>math:FA</setSpec>
 </header>
 <metadata>
  …
 </metadata>
</record>
Super-sets do not need to be included, e.g. no math if math:FA

Datestamp and granularity
• all dates/times are UTC, encoded in ISO8601, Z-notation:
1999-03-20T20:30:00Z
or just with year, month, day:
1999-03-20
• harvesting granularity
– mandatory support of YYYY-MM-DD
– optional support of YYYY-MM-DDThh:mm:ssZ
– granularity of from and until must be the same
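
The granularity rules for from and until can be checked with two regular expressions. This is a sketch under stated assumptions: the function names and the `supports_seconds` flag are invented for the example, and only the shape of the datestamp is checked, not whether it is a real calendar date.

```python
import re

DAY = re.compile(r"\d{4}-\d{2}-\d{2}")                       # YYYY-MM-DD
SECONDS = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z")  # YYYY-MM-DDThh:mm:ssZ


def granularity(stamp):
    """Return "day", "seconds", or None if the stamp has neither form."""
    if DAY.fullmatch(stamp):
        return "day"
    if SECONDS.fullmatch(stamp):
        return "seconds"
    return None


def check_range(from_, until, supports_seconds=False):
    """Return "badArgument" for an unusable from/until pair, else None."""
    grains = [granularity(s) for s in (from_, until) if s is not None]
    if None in grains:
        return "badArgument"   # not a recognised datestamp form
    if len(set(grains)) > 1:
        return "badArgument"   # from and until differ in granularity
    if "seconds" in grains and not supports_seconds:
        return "badArgument"   # repository only supports YYYY-MM-DD
    return None
```

With these definitions, `check_range("1999-03-20", "1999-03-20T20:30:00Z", supports_seconds=True)` is rejected because the two arguments mix granularities.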

metadataPrefix and setSpec
• The character set for metadataPrefix and setSpec is the following set of URL-safe characters:
A-Z a-z 0-9 - _ . ! ~ * ' ( )

ListIdentifiers
ListIdentifiers returns headers (should really have been called ListHeaders)
<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH>
 <responseDate>2002-02-08T08:55:46Z</responseDate>
 <request verb="…" …>http://arXiv.org/oai2</request>
 <ListIdentifiers>
  <header>
   <identifier>oai:arXiv:hep-th/9801001</identifier>
   <datestamp>1999-02-2
