The Berlin SPARQL Benchmark
Christian Bizer, Andreas Schultz
Freie Universität Berlin
Indianapolis, August 20, 2008
Christian Bizer: The Berlin SPARQL Benchmark (8/20/2008)

Overview
1. The SPARQL Query Language and Protocol
   - What is SPARQL?
2. Design of the Berlin SPARQL Benchmark
   - The dataset and the query mix
3. Benchmark Experiment and Results
   - Which store is the best one?

1. The SPARQL Query Language for RDF
- Flexible query language for the RDF data model
- W3C Recommendation since 15 January 2008

Example Query
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?mbox
    WHERE { ?x foaf:name ?name .
            ?x foaf:mbox ?mbox }

Query Result
    ?name                  ?mbox
    "Johnny Lee Outlaw"    <mailto:jlow@example.com>
    "Peter Goodguy"        <mailto:peter@example.org>

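The example query joins two triple patterns on the shared variable ?x. The following is a minimal Python sketch of that matching step over an in-memory list of triples. The subject labels `_:a` and `_:b`, the tuple representation, and the helper names `match_pattern`, `evaluate`, and `select` are assumptions made for illustration; this is a naive nested-loop join, not how any of the stores discussed here evaluates SPARQL.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical data: the subjects "_:a" and "_:b" are made up for this sketch.
TRIPLES = [
    ("_:a", FOAF + "name", '"Johnny Lee Outlaw"'),
    ("_:a", FOAF + "mbox", "<mailto:jlow@example.com>"),
    ("_:b", FOAF + "name", '"Peter Goodguy"'),
    ("_:b", FOAF + "mbox", "<mailto:peter@example.org>"),
]


def match_pattern(pattern, triple, binding):
    """Return `binding` extended so that `pattern` matches `triple`, else None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Variable: bind it, or check it agrees with an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            # Constant: must be identical to the triple's term.
            return None
    return result


def evaluate(patterns, triples):
    """Evaluate a list of triple patterns as a naive nested-loop join."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


def select(variables, bindings):
    """Project each solution onto the selected variables."""
    return [tuple(b[v] for v in variables) for b in bindings]


QUERY = [("?x", FOAF + "name", "?name"), ("?x", FOAF + "mbox", "?mbox")]
rows = select(["?name", "?mbox"], evaluate(QUERY, TRIPLES))
for row in rows:
    print(*row)
```

Because both patterns share ?x, a name is only paired with a mailbox of the same subject, which is what produces the two result rows shown on the slide.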
SPARQL Implementations
There are currently 22 SPARQL implementations
- http://esw.w3.org/topic/SparqlImplementations
Example RDF stores
- Sesame
- Jena SDB, TDB
- OpenLink Virtuoso
- SemWeb .Net Library
Relational database to RDF wrappers
- D2R Server
- OpenLink Virtuoso
- See also: W3C RDB2RDF Incubator Group
Federated SPARQL Query Engines
- DARQ
- The Semantic Discovery System

The SPARQL Protocol for RDF
For sending SPARQL queries to a SPARQL endpoint
- HTTP Binding
- SOAP Binding
W3C Recommendation since 15 January 2008
Example Request
http://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=SELECT+%3Ffilm%0D%0AWHERE+%7B+%3Ffilm+%3Chttp%3A%2F%2Fwww.w3.org%2F2004%2F02%2Fskos%2Fcore%23subject%3E+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FCategory%3AFrench_films%3E+%7D&format=text%2Fhtml

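The example request is an ordinary HTTP GET URL in which the query text and the default graph URI are percent-encoded as query-string parameters. A minimal sketch of how such a URL could be assembled with Python's standard library follows. The helper name `build_sparql_get_url` is hypothetical, and the `format` parameter is passed through only because it appears in the example request; it is treated here as an endpoint-specific extra rather than a protocol parameter.

```python
from urllib.parse import urlencode


def build_sparql_get_url(endpoint, query, default_graph_uri=None, extra_params=None):
    """Build a GET URL carrying a SPARQL query in the `query` parameter."""
    params = []
    if default_graph_uri is not None:
        params.append(("default-graph-uri", default_graph_uri))
    params.append(("query", query))
    # Extras such as `format` are passed through unchanged (endpoint-specific).
    params.extend((extra_params or {}).items())
    # urlencode percent-encodes reserved characters and turns spaces into "+".
    return endpoint + "?" + urlencode(params)


# The decoded query from the example request (it contains a CRLF line break).
QUERY = (
    "SELECT ?film\r\n"
    "WHERE { ?film <http://www.w3.org/2004/02/skos/core#subject> "
    "<http://dbpedia.org/resource/Category:French_films> }"
)

url = build_sparql_get_url(
    "http://dbpedia.org/sparql",
    QUERY,
    default_graph_uri="http://dbpedia.org",
    extra_params={"format": "text/html"},
)
print(url)
```

The sketch only constructs the URL; sending it (for example with `urllib.request.urlopen`) is left out so that the block does not depend on the endpoint being reachable.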
Public SPARQL Endpoints on the Web
The ESW Wiki lists 24 public SPARQL endpoints
- http://esw.w3.org/topic/SparqlEndpoints
Examples
- BBC Backstage
- DBpedia
- DBtune
- DOAP Space
- Linked Movie Data Base
- DBLP Bibliography
- Revyu
- GovTrack.us
- Project Gutenberg Metadata
- Gene Ontology Database

2. Design of the Berlin SPARQL Benchmark (BSBM)

Existing Benchmarks for Semantic Web Technologies
Lehigh University Benchmark (LUBM)
- benchmark for comparing the performance of OWL reasoning engines
- does not test specific SPARQL features such as OPTIONAL or UNION
DBpedia Benchmark
- uses DBpedia as benchmark dataset
- 5 queries that were relevant for DBpedia Mobile
- very specific queries, benchmark dataset not scalable
SP2Bench
- from the databases group at Freiburg University, Germany
- uses a synthetic, scalable version of the DBLP bibliography dataset
- queries are designed for the comparison of different RDF store layouts
- queries are not designed towards realistic workloads

Design Goals of the BSBM
1. Test storage systems with realistic workloads of use case motivated queries
2. Allow the comparison of storage systems across internal data models
   - native RDF stores
   - relational database to RDF wrappers
   - other data sources (OODBs or XML repositories)
3. Do not require complex reasoning but measure query performance against large amounts of RDF data

Benchmark Dataset
The benchmark is built around an e-commerce use case, where a set of products is offered by different vendors and consumers have posted reviews about products.
Products
- have a varying number of textual and numeric properties
- have a varying number of product features
- words for product names and product descriptions are randomly chosen from a dictionary
Relations: Products-Producers, Products-Offers, Offers-Vendors, Products-Reviews
- normal distributions with different parameters
Products are in a product hierarchy
- the hierarchy grows with dataset size

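The slide states that the relation cardinalities follow normal distributions with different parameters but does not give those parameters. The sketch below shows one way to draw per-product counts from such distributions. The means and standard deviations, the relation names, and the helper `sample_count` are all made up for illustration and are not the values used by the BSBM generator.

```python
import random


def sample_count(rng, mean, stddev, minimum=0):
    """Draw an integer count from a normal distribution, clamped at `minimum`."""
    return max(minimum, round(rng.gauss(mean, stddev)))


# Hypothetical (mean, standard deviation) pairs; the slides give no numbers.
RELATION_PARAMS = {
    "offers_per_product": (20.0, 5.0),
    "reviews_per_product": (10.0, 3.0),
}

rng = random.Random(42)  # a fixed seed keeps the sketch reproducible
counts = {
    name: [sample_count(rng, mean, stddev) for _ in range(1000)]
    for name, (mean, stddev) in RELATION_PARAMS.items()
}

for name, values in counts.items():
    print(name, "average:", sum(values) / len(values))
```

Clamping at zero slightly distorts the distribution when the mean is close to zero; with the assumed parameters this rarely triggers.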
Data Generator
Supports the creation of arbitrarily large datasets using the number of products as scale factor.
Output formats
- Turtle
- N-Triples
- XML
- MySQL dump
The XML and the relational representation can be used to compare stores across internal data models.
The data generation is deterministic.
The data generator is published under the GPL license.

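Two properties of the generator named above lend themselves to a small illustration: the number of products acts as the scale factor, and generation is deterministic. The sketch below is not the BSBM generator. The namespace `http://example.org/bsbm/`, the `price` property, the five-word dictionary, and the function `generate_ntriples` are assumptions chosen only to show a seeded generator that emits N-Triples.

```python
import random

BASE = "http://example.org/bsbm/"  # hypothetical namespace
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
WORDS = ["alpha", "bravo", "charlie", "delta", "echo"]  # stand-in dictionary


def generate_ntriples(num_products, seed=0):
    """Return N-Triples lines for `num_products` products, deterministically."""
    rng = random.Random(seed)  # same seed and scale factor -> same output
    lines = []
    for i in range(1, num_products + 1):
        subject = f"<{BASE}Product{i}>"
        # Product labels are built from randomly chosen dictionary words.
        label = " ".join(rng.choice(WORDS) for _ in range(3))
        price = rng.randint(1, 1000)
        lines.append(f'{subject} <{RDFS_LABEL}> "{label}" .')
        lines.append(f'{subject} <{BASE}price> "{price}"^^<{XSD_INTEGER}> .')
    return lines


if __name__ == "__main__":
    for line in generate_ntriples(2, seed=1):
        print(line)
```

Using a private `random.Random(seed)` instance rather than the module-level functions keeps the output independent of any other random number use in the same process, which is one simple way to obtain reproducible datasets.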