Strategies for Detection of Correlated Data Streams

DocUID: 2018-004

Full Text: PDF

Authors: Rakan Alseghayer, Daniel Petrov, Panos K. Chrysanthis

Abstract: There is an increasing demand for real-time analysis of large volumes of data streams that are produced at high velocity. The most recent data needs to be processed within a specified delay target in order for the analysis to lead to actionable results. In this paper we present an effective solution for the analysis of such data streams, based on a three-fold approach that combines (1) incremental sliding-window computation of aggregates, to avoid unnecessary recomputations, (2) intelligent scheduling of computation steps and operations, driven by a utility function within a micro-batch, and (3) an exploration strategy that tunes the utility function. Specifically, we propose eight strategies that explore correlated pairs of live data streams across consecutive micro-batches. Our experimental evaluation on a real dataset shows that some strategies are better suited to identifying high numbers of correlated pairs of live data streams that are already known from previous micro-batches, while others are better suited to identifying previously unseen pairs of live data streams across consecutive micro-batches.
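The first component of the approach, incremental sliding-window computation of aggregates, can be illustrated with a minimal sketch. The abstract does not name the correlation measure or the aggregates that are maintained, so this example assumes the Pearson correlation coefficient and keeps five running sums per pair of streams; the class name `SlidingCorrelation` and its methods are hypothetical and are not taken from the paper.

```python
from collections import deque
from math import sqrt


class SlidingCorrelation:
    """Pearson correlation of two streams over the last `window` points.

    Assumption: Pearson correlation is used as the measure. The running
    sums are updated in O(1) per new point instead of being recomputed
    over the whole window.
    """

    def __init__(self, window):
        self.window = window
        self.pairs = deque()  # points currently inside the window
        # Running sums (sufficient statistics) for the current window.
        self.sx = self.sy = self.sxx = self.syy = self.sxy = 0.0

    def push(self, x, y):
        """Add one (x, y) observation; evict the oldest once the window is full."""
        self.pairs.append((x, y))
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.syy += y * y
        self.sxy += x * y
        if len(self.pairs) > self.window:
            ox, oy = self.pairs.popleft()
            self.sx -= ox
            self.sy -= oy
            self.sxx -= ox * ox
            self.syy -= oy * oy
            self.sxy -= ox * oy

    def correlation(self):
        """Return the correlation of the current window, or None if undefined."""
        n = len(self.pairs)
        if n < 2:
            return None
        cov = n * self.sxy - self.sx * self.sy
        var_x = n * self.sxx - self.sx ** 2
        var_y = n * self.syy - self.sy ** 2
        if var_x <= 0 or var_y <= 0:
            return None  # a constant stream has no defined correlation
        return cov / sqrt(var_x * var_y)
```

Calling `push(x, y)` once per arriving pair of values and `correlation()` whenever a result is needed keeps the per-point cost constant regardless of the window length. Subtracting evicted values from floating-point sums can accumulate rounding error on long-running streams, so a production version would likely need periodic recomputation or a more numerically stable update.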

Keywords: Data Streams, Data Exploration, Correlation, Search, Subsequence

Published In: 5th International Workshop on Exploratory Search in Databases and the Web

Pages: 1-6

Place Published: Houston, TX, USA

Year Published: 2018

Note: Co-located with ACM SIGMOD/PODS 2018

Project: Data Exploration

Subject Area: Query Processing, Data Streams, Data Exploration

Publication Type: Workshop Paper

Sponsor: NIH U01HL137159

Citation: Rakan Alseghayer, Daniel Petrov, and Panos K. Chrysanthis. Strategies for Detection of Correlated Data Streams. 5th International Workshop on Exploratory Search in Databases and the Web. 1-6. 2018. Houston, TX, USA. (Note: Co-located with ACM SIGMOD/PODS 2018).
