Browsing by Author "Arenas, Marcelo"
Now showing 1 - 20 of 21
Results Per Page
Sort Options
- ItemAn extension of SPARQL for RDFS(SPRINGER-VERLAG BERLIN, 2008) Arenas, Marcelo; Gutierrez, Claudio; Perez, Jorge; Christophides, V; Collard, M; Gutierrez, CRDF Schema (RDFS) extends RDF with a schema vocabulary with a predefined semantics. Evaluating queries which involve this vocabulary is challenging, and there is not yet consensus in the Semantic Web community on how to define a query language for RDFS. In this paper, we introduce a language for querying RDFS data. This language is obtained by extending SPARQL with nested regular expressions that allow to navigate through an RDF graph with RDFS vocabulary. This language is expressive enough to answer SPARQL queries involving RDFS vocabulary, by directly traversing the input graph.
- ItemBidirectional Constraints for Exchanging Data: Beyond Monotone Queries(IJCAI-INT JOINT CONF ARTIF INTELL, 2015) Arenas, Marcelo; Dieguez, Gabriel; Perez, Jorge; Yang, Q; Wooldridge, MIn this paper, we propose to use the language of bidirectional constraints to specify schema mappings in the context of data exchange. These constraints impose restrictions over both the source and the target data, and have the potential to minimize the ambiguity in the description of the target data to be materialized. We start by making a case for the usefulness of bidirectional constraints to give a meaningful closed-world semantics for st-tgds, which is motivated by Clark's predicate completion and Reiter's formalization of the closed-world assumption of a logical theory. We then formally study the use of bidirectional constraints in data exchange. In particular, we pinpoint the complexity of the existence-of-solutions and the query evaluation problems in several different scenarios, including in the latter case both monotone and nonmonotone queries.
- ItemComposition and Inversion of Schema Mappings(ASSOC COMPUTING MACHINERY, 2009) Arenas, Marcelo; Perez, Jorge; Reutter, Juan; Riveros, Cristian
- ItemGame-based notions of locality over finite models(2008) Arenas, Marcelo; Barceló, Pablo; Libkin, LeonidLocality notions in logic say that the truth value of a formula can be determined locally, by looking at the isomorphism type of a small neighbourhood of its free variables. Such notions have proved to be useful in many applications. They all, however, refer to isomorphisms of neighbourhoods, which most local logics cannot test. A stronger notion of locality says that the truth value of a formula is determined by what the logic itself can say about that small neighbourhood. Since the expressiveness of many logics can be characterized by games, one can also say that the truth value of a formula is determined by the type, with respect to a game, of that small neighbourhood. Such game-based notions of locality can often be applied when traditional isomorphism-based notions of locality cannot. Our goal is to study game-based notions of locality. We work with an abstract view of games that subsumes games for many logics. We look at three, progressively more complicated locality notions. The easiest requires only very mild conditions on the game and works for most logics of interest. The other notions, based on Hanf's and Gaifman's theorems, require more restrictions. We state those restrictions and give examples of logics that satisfy and fail the respective game-based notions of locality. (c) 2007 Elsevier B.V. All rights reserved.
- ItemInformation systems preface(PERGAMON-ELSEVIER SCIENCE LTD, 2009) Arenas, Marcelo; Schwartzbach, Michael I.
- ItemIs it Possible to Verify if a Transaction is Spendable?(FRONTIERS MEDIA SA, 2021) Arenas, Marcelo; Reisenegger, Thomas; Reutter, Juan; Vrgoc, DomagojWith the popularity of Bitcoin, there is a growing need to understand the functionality, security, and performance of various mechanisms that comprise it. In this paper, we analyze Bitcoin's scripting language, Script, that is one of the main building blocks of Bitcoin transactions. We formally define the semantics of Script, and study the problem of determining whether a user-defined script is well-formed; that is, whether it can be unlocked, or whether it contains errors that would prevent this from happening.
- ItemLocality of queries and transformations (Invited talk)(SPRINGER, 2006) Arenas, Marcelo; Navarro, G; Bertossi, L; Kohayakawa, Y
- Item#NFA Admits an FPRAS: Efficient Enumeration, Counting, and Uniform Generation for Logspace Classes(ASSOC COMPUTING MACHINERY, 2021) Arenas, Marcelo; Alberto Croquevielle, Luis; Jayaram, Rajesh; Riveros, CristianIn this work, we study two simple yet general complexity classes, based on logspace Turing machines, that provide a unifying framework for efficient query evaluation in areas such as information extraction and graph databases, among others. We investigate the complexity of three fundamental algorithmic problems for these classes: enumeration, counting, and uniform generation of solutions, and show that they have several desirable properties in this respect.
- ItemnSPARQL: A Navigational Language for RDF(SPRINGER-VERLAG BERLIN, 2008) Perez, Jorge; Arenas, Marcelo; Gutierrez, Claudio; Sheth, A; Staab, S; Paolucci, M; Maynard, D; Finin, T; Krishnaprasad, TNavigational features have been largely recognized as fundamental for graph database query languages. This fact has motivated several authors to propose RDF query languages with navigational capabilities. In particular, we have argued in a previous paper that nested regular expressions are appropriate to navigate RDF data, and we have proposed the nSPARQL query language for RDF, that uses nested regular expressions as building blocks. In this paper, we study some of the fundamental properties of nSPARQL concerning expressiveness and complexity of evaluation. Regarding expressiveness, we show that nSPARQL is expressive enough to answer queries considering the semantics of the RDFS vocabulary by directly traversing the input graph. We also show that nesting is necessary to obtain this last result, and we study the expressiveness of the combination of nested regular expressions and SPARQL operators. Regarding complexity of evaluation, we prove that the evaluation of a nested regular expression E over an RDF graph G can be computed in time O(vertical bar G vertical bar . vertical bar E vertical bar).
- ItemOBDA: Query Rewriting or Materialization? In Practice, Both!(SPRINGER INTERNATIONAL PUBLISHING AG, 2014) Sequeda, Juan F.; Arenas, Marcelo; Miranker, Daniel P.; Mika, P; Tudorache, T; Bernstein, A; Welty, C; Knoblock, C; Vrandecic, D; Groth, P; Noy, N; Janowicz, K; Goble, CGiven a source relational database, a target OWL ontology and a mapping from the source database to the target ontology, Ontology-Based Data Access (OBDA) concerns answering queries over the target ontology using these three components. This paper presents the development of Ultrawrap(OBDA), an OBDA system comprising bidirectional evaluation; that is, a hybridization of query rewriting and materialization. We observe that by compiling the ontological entailments as mappings, implementing the mappings as SQL views and materializing a subset of the views, the underlying SQL optimizer is able to reduce the execution time of a SPARQL query by rewriting the query in terms of the views specified by the mappings. To the best of our knowledge, this is the first OBDA system supporting ontologies with transitivity by using SQL recursion. Our contributions include: (1) an efficient algorithm to compile ontological entailments as mappings; (2) a proof that every SPARQL query can be rewritten into a SQL query in the context of mappings; (3) a cost model to determine which views to materialize to attain the fastest execution time; and (4) an empirical evaluation comparing with a state-of-the-art OBDA system, which validates the cost model and demonstrates favorable execution times.
- ItemOn the complexity of verifying consistency of XML specifications(SIAM PUBLICATIONS, 2008) Arenas, Marcelo; Fan, Wenfei; Libkin, LeonidXML specifications often consist of a type definition (typically, a document type definition (DTD)) and a set of integrity constraints. It has been shown previously that such specifications can be inconsistent, and thus it is often desirable to check consistency at compile time. It is known [W. Fan and L. Libkin, J. ACM, 49 (2002), pp. 368 - 406] that for general keys, foreign keys, and DTDs the consistency problem is undecidable; however, it becomes NP-complete when all keys are one-attribute (unary) and tractable, if no foreign keys are used. In this paper, we consider a variety of previously studied constraints for XML data and investigate the complexity of the consistency problem. Our main conclusion is that, in the presence of foreign key constraints, compile-time veri. cation of consistency is infeasible. We look at absolute constraints that hold in the entire document and relative constraints that hold only in a part of the document. For absolute constraints, we prove decidability and establish complexity bounds for primary multiattribute keys and unary foreign keys and study unary constraints that involve regular expressions. For relative constraints, we prove that even for unary constraints the consistency problem is undecidable. We also show that results continue to hold for extended DTDs, a more expressive typing mechanism for XML.
- ItemQuerying in the Age of Graph Databases and Knowledge Graphs(ASSOC COMPUTING MACHINERY, 2021) Arenas, Marcelo; Gutierrez, Claudio; Sequeda, Juan F.Graphs have become the best way we know of representing knowledge. The computing community has investigated and developed the support for managing graphs by means of digital technology. Graph databases and knowledge graphs surface as the most successful solutions to this program. This tutorial will provide a conceptual map of the data management tasks underlying these developments, paying particular attention to data models and query languages for graphs.
- ItemQuerying Semantic Data on the Web(ASSOC COMPUTING MACHINERY, 2012) Arenas, Marcelo; Gutierrez, Claudio; Miranker, Daniel P.; Perez, Jorge; Sequeda, Juan F.
- ItemReasoning Web Semantic Technologies for the Web of Data 7th International Summer School 2011 Galway, Ireland, August 23-27, 2011 Tutorial Lectures Preface(SPRINGER-VERLAG BERLIN, 2011) Polleres, Axel; D'Amato, Claudia; Arenas, Marcelo; Handschuh, Siegfried; Kroner, Paula; Ossowski, Sascha; Patel Schneider, Peter; Polleres, A; DAmato, C; Arenas, M; Handschuh, S; Kroner, P; Ossowski, S; PatelSchneider, P
- ItemScalar aggregation in FD-inconsistent databases(2001) Arenas, Marcelo; Bertossi Durán, Leopoldo; Chomicki, Jan; Bussche, Jan Van den; Vianu, Victor
- ItemScreening of COVID-19 cases through a Bayesian network symptoms model and psychophysical olfactory test(CELL PRESS, 2021) Eyheramendy, Susana; Saa, Pedro A.; Undurraga, Eduardo A.; Valencia, Carlos; Lopez, Carolina; Mendez, Luis; Pizarro Berdichevsky, Javier; Finkelstein Kulka, Andres; Solari, Sandra; Salas, Nicolas; Bahamondes, Pedro; Ugarte, Martin; Barcelo, Pablo; Arenas, Marcelo; Agosin, EduardoThe sudden loss of smell is among the earliest and most prevalent symptoms of COVID-19 when measured with a clinical psychophysical test. Research has shown the potential impact of frequent screening for olfactory dysfunction, but existing tests are expensive and time consuming. We developed a low-cost ($0.50/test) rapid psychophysical olfactory test (KOR) for frequent testing and a model-based COVID-19 screening framework using a Bayes Network symptoms model. We trained and validated the model on two samples: suspected COVID-19 cases in five healthcare centers (n = 926; 33% prevalence, 309 RT-PCR confirmed) and healthy miners (n = 1,365; 1.1% prevalence, 15 RT-PCR confirmed). The model predicted COVID-19 status with 76% and 96% accuracy in the healthcare and miners samples, respectively (healthcare: AUC = 0.79 [0.75-0.82], sensitivity: 59%, specificity: 87%; miners: AUC = 0.71 [0.63-0.79], sensitivity: 40%, specificity: 97%, at 0.50 infection probability threshold). Our results highlight the potential for low-cost, frequent, accessible, routine COVID-19 testing to support society's reopening.
- ItemSemFacet: Semantic Faceted Search over Yago(ASSOC COMPUTING MACHINERY, 2014) Arenas, Marcelo; Grau, Bernardo Cuenca; Kharlamov, Evgeny; Marciuska, Sarunas; Zheleznyakov, Dmitriy; Jimenez Ruiz, Ernesto
- ItemThe Complexity of Counting Problems Over Incomplete Databases(ASSOC COMPUTING MACHINERY, 2021) Arenas, Marcelo; BarcelO, Pablo; Monet, MikaelWe study the complexity of various fundamental counting problems that arise in the context of incomplete databases, i.e., relational databases that can contain unknown values in the form of labeled nulls. Specifically, we assume that the domains of these unknown values are finite and, for a Boolean query q, we consider the following two problems: Given as input an incomplete database D, (a) return the number of completions of D that satisfy q; or (b) return the number of valuations of the nulls of D yielding a completion that satisfies q. We obtain dichotomies between #P-hardness and polynomial-time computability for these problems when q is a self-join-free conjunctive query and study the impact on the complexity of the following two restrictions: (1) every null occurs at most once in D (what is called Codd tables); and (2) the domain of each null is the same. Roughly speaking, we show that counting completions is much harder than counting valuations: For instance, while the latter is always in #P, we prove that the former is not in #P under some widely believed theoretical complexity assumption. Moreover, we find that both (1) and (2) can reduce the complexity of our problems. We also study the approximability of these problems and show that, while counting valuations always has a fully polynomial-time randomized approximation scheme (FPRAS), in most cases counting completions does not. Finally, we consider more expressive query languages and situate our problems with respect to known complexity classes.
- ItemThe Tractability of SHAP-Score-Based Explanations over Deterministic and Decomposable Boolean Circuits(ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE, 2021) Arenas, Marcelo; Barcelo, Pablo; Bertossi, Leopoldo; Monet, MikaelScores based on Shapley values are widely used for providing explanations to classification results over machine learning models. A prime example of this is the influential SHAP - score, a version of the Shapley value that can help explain the result of a learned model on a specific entity by assigning a score to every feature. While in general computing Shapley values is a computationally intractable problem, it has recently been claimed that the SHAP-score can be computed in polynomial time over the class of decision trees. In this paper, we provide a proof of a stronger result over Boolean models: the SHAP -score can be computed in polynomial time over deterministic and decomposable Boolean circuits. Such circuits, also known as tractable Boolean circuits, generalize a wide range of Boolean circuits and binary decision diagrams classes, including binary decision trees, Ordered Binary Decision Diagrams (OBDDs) and Free Binary Decision Diagrams (FBDDs). We also establish the computational limits of the notion of SHAP-score by observing that, under a mild condition, computing it over a class of Boolean models is always polynomially as hard as the model counting problem for that class. This implies that both determinism and decomposability are essential properties for the circuits that we consider, as removing one or the other renders the problem of computing the SHAP-score intractable (namely, #P-hard).
- ItemTowards Semantic Faceted Search(ASSOC COMPUTING MACHINERY, 2014) Arenas, Marcelo; Grau, Bernardo Cuenca; Kharlamov, Evgeny; Marciuska, Sarunas; Zheleznyakov, DmitriyIn this paper we present limitations of conventional faceted search in the way data, facets, and queries are modelled. We discuss how these limitations can be addressed with Semantic Web technologies such as RDF, OWL 2, and SPARQL 1.1. We also present a system, SemFacet, that is a proof-of-concept prototype of our approach implemented on top of Yago knowledge base, powered by the OWL 2 RL triple store RDFox, and the full text search engine Lucene.