Low-complexity aggregation in GraphLog and Datalog
Published in | Theoretical computer science Vol. 116; no. 1; pp. 95 - 116 |
---|---|
Main Authors | , |
Format | Journal Article; Conference Proceeding |
Language | English |
Published | Amsterdam Elsevier B.V 02.08.1993 Elsevier |
Summary: | We present constructs for computing aggregate functions over sets of tuples and along paths in a database graph.
We show how Datalog can be extended to compute a large class of queries with aggregates without incurring the large expense of a language with general set manipulation capabilities. In particular, we aim for queries that can be executed efficiently in parallel, using the class NC and its various subclasses as formal models of low parallel complexity.
Our approach retains the standard relational notion of relations as sets of tuples, not requiring the introduction of multisets. In the case where no rules are recursive, the language is exactly as expressive as Klug's first-order language with aggregates. We show that this class of nonrecursive programs cannot express transitive closure (unless LOGSPACE = NLOGSPACE) thus providing evidence for a widely believed but never proven folk result. We also study the expressive power and complexity of languages that support aggregation over recursion.
We then describe how these constructs, as well as manipulating the length of paths in database graphs, are incorporated into our visual query language GraphLog. While GraphLog could easily be extended to handle all the queries described above, we prefer to restrict the language in a natural way to avoid explicit recursion; all recursion is expressed as transitive closure. We show that this guarantees that all expressible queries are in NC. We analyze other proposals and show that they can express queries that are logspace-complete for P and, thus, unlikely to be parallelizable efficiently. |
---|---|
ISSN: | 0304-3975 1879-2294 |
DOI: | 10.1016/0304-3975(93)90221-E |