WDBench: A Wikidata Graph Query Benchmark

Sattler, Ulrike; Angles, Renzo; Aranda, Carlos Buil; Keet, Maria; Presutti, Valentina; Vrgoč, Domagoj; Almeida, Joao Paulo A.; Takeda, Hideaki; Monnin, Pierre; Pirro, Giuseppe; d'Amato, Claudia

Abstract

We propose WDBench: a query benchmark for knowledge graphs based on Wikidata, featuring real-world queries extracted from the public query logs of the Wikidata SPARQL endpoint. While a number of benchmarks for graph databases (including SPARQL engines) have been proposed in recent years, few are based on real-world data, even fewer use real-world queries, and fewer still allow for comparing SPARQL engines with (non-SPARQL) graph databases. The raw Wikidata query log contains millions of diverse queries; running all of them would be prohibitively costly, and the mix of features they use would make it difficult to draw conclusions. WDBench thus focuses on four main query features that are common to SPARQL and graph databases: (i) basic graph patterns, (ii) optional graph patterns, (iii) path patterns, and (iv) navigational graph patterns. We extract queries from the Wikidata logs specifically to test these patterns, clean them of non-standard features, remove duplicates, classify them into different structural subsets, and present them in two different syntaxes. Using this benchmark, we present and compare performance results for evaluating queries using Blazegraph, Jena/Fuseki, Virtuoso and Neo4j.

More information

Title according to WOS: WDBench: A Wikidata Graph Query Benchmark
Title according to SCOPUS: WDBench: A Wikidata Graph Query Benchmark
Journal title: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13489
Publisher: Springer Science and Business Media Deutschland GmbH
Publication date: 2022
Last page: 731
Language: English
DOI: 10.1007/978-3-031-19433-7_41

Notes: ISI, SCOPUS