Worst-Case-Optimal Similarity Joins on Graph Databases

dc.catalogadorgjm
dc.contributor.authorArroyuelo Billiardi, Diego Gastón
dc.contributor.authorBustos, Benjamin
dc.contributor.authorGómez-Brandón, Adrián
dc.contributor.authorHogan, Aidan
dc.contributor.authorNavarro, Gonzalo
dc.contributor.authorReutter de la Maza, Juan
dc.date.accessioned2024-05-23T17:13:44Z
dc.date.available2024-05-23T17:13:44Z
dc.date.issued2024
dc.description.abstractWe extend the concept of worst-case optimal equijoins in graph databases to the case where some nodes are required to be within the k-nearest neighbors (kNN) of others under some similarity function. We model the problem by superimposing the database graph with the kNN graph and show that a variant of Leapfrog TrieJoin (LTJ) implemented over a compact data structure called the Ring can be seamlessly extended to integrate similarity clauses with the equijoins in the LTJ query process, retaining worst-case optimality in many relevant cases. Our experiments on a benchmark that combines Wikidata and IMGpedia show that our enhanced LTJ algorithm outperforms by a considerable margin a baseline that first applies classic LTJ and then completes the query by applying the similarity predicates. The difference is more pronounced on queries where the similarity clauses are more densely connected to the query, becoming of an order of magnitude in some cases.
dc.format.extent26 páginas
dc.fuente.origenORCID
dc.identifier.doi10.1145/3639294
dc.identifier.urihttps://doi.org/10.1145/3639294
dc.identifier.urihttps://dl.acm.org/doi/10.1145/3639294
dc.identifier.urihttps://repositorio.uc.cl/handle/11534/85770
dc.information.autorucEscuela de Ingeniería; Arroyuelo Billiardi, Diego Gastón; S/I; 1321287
dc.information.autorucEscuela de Ingeniería; Reutter de la Maza, Juan; 0000-0002-2186-0312; 126898
dc.issue.numero1
dc.language.isoen
dc.nota.accesocontenido parcial
dc.revistaProceedings of the ACM on Management of Data
dc.rightsacceso restringido
dc.subjectWorst-case optimal joins
dc.subjectLeapfrog Triejoin
dc.subjectGraph patterns
dc.subjectGraph databases
dc.subjectGraph indexing
dc.subjectSimilarity joins
dc.subjectNearest-neighbor graphs
dc.subject.ddc600
dc.subject.deweyTecnologíaes_ES
dc.titleWorst-Case-Optimal Similarity Joins on Graph Databases
dc.typeartículo
dc.volumen2
sipa.codpersvinculados1321287
sipa.codpersvinculados126898
sipa.trazabilidadORCID;2024-05-20
Files