Temporal and Interval Data Management

Abstract

In temporal, spatial, and uncertain databases, information is often represented by (multidimensional) intervals. For example, an interval is used to model the validity of a value or an object in a temporal database, or the range of possible values in an uncertain database. Basic query operations such as joins and aggregations are re-defined in this space. We study the evaluation of such operations with the introduction of novel algorithms and optimization techniques. In [1], we take a largely ignored forward scan (FS) based plane sweep algorithm for interval and rectangle joins, which is extremely simple to implement and propose two optimizations of FS that greatly reduce its cost, making it competitive to the state-of-the-art plane sweep algorithm for interval joins. In addition, we show the drawbacks of a previously proposed hash-based partitioning approach for parallel interval join processing and suggest a domain-based partitioning approach that does not produce duplicate results. Within our approach we propose a novel breakdown of the partition join jobs into a small number of independent mini-join jobs with varying cost and manage to avoid redundant comparisons. In [2], we study the evaluation of an interval count semi-join (ICSJ) operation that can be used for selecting or ranking intervals based on the number of join pairs they appear in. We extend the state-of-the-art algorithm for interval joins to evaluate ICSJ at the cost of only scanning the sorted interval endpoints.

Publications

[1] P. Bouros and N. Mamoulis, «A Forward Scan based Plane Sweep Algorithm for Parallel Interval Joins,» Proceedings of the VLDB Endowment (PVLDB), 10(11): 1346-1357, August 2017.

[2] P. Bouros and N. Mamoulis, «Interval Count Semi-Joins,» Proceedings of the 21st International Conference on Extending Database Technology (EDBT), pp. 425-428, Vienna, Austria, March 2018.