Detection of Highly Correlated Live Data Streams

DocUID: 2017-008

Author: Rakan Alseghayer, Daniel Petrov, Panos K. Chrysanthis, Mohamed A. Sharaf, Alexandros Labrinidis

Abstract: More and more organizations (commercial, health, government and security) currently base their decisions on real-time analysis of fast arriving, large volumes of data streams. For such analysis to lead to actionable information in real-time and at the right time, the most recent data needs to be processed within a specified delay target. Effective solutions for analysis of such data streams rely on two techniques, (1) incremental sliding-window computation of aggregates, to avoid unnecessary recomputations and (2) intelligent scheduling of computational steps and operations. In this paper, we propose a solution that combines both of these techniques to find highly correlated data streams in real-time, using the Pearson Correlation Coefficient as a correlation metric for two windows of data streams. Specifically, we propose to partition a set of data streams into micro-batches that capture the delay target, use sliding windows within a range as the subsequences of values exhibiting a certain level of correlation, utilize the idea of sufficient statistics to incrementally compute the Pearson Correlation Coefficient of pairs of sliding windows, and adopt a deadline-aware priority scheduling to detect the highly correlated pairs of data streams. Our experimental results show that our scheme and in particular our Price-DCS with warm start scheduling algorithm outperform existing ones and enable high degree of interactivity in correlating live data streams micro-batches.
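
Illustration (not from the paper): the abstract's idea of computing the Pearson Correlation Coefficient incrementally from sufficient statistics can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the class name `SlidingPearson`, its methods, and the fixed window size are hypothetical, and the micro-batching and deadline-aware scheduling described in the abstract are not modeled. The sketch keeps five running sums for one pair of streams, so each new value updates the correlation in constant time instead of rescanning the window.

```python
import math
from collections import deque


class SlidingPearson:
    """Pearson correlation over the last `window` pairs of two streams.

    Hypothetical helper for illustration only. The five running sums
    (sx, sy, sxx, syy, sxy) act as the sufficient statistics.
    """

    def __init__(self, window):
        if window < 2:
            raise ValueError("window must hold at least two pairs")
        self.window = window
        self.pairs = deque()  # values currently inside the window
        self.sx = self.sy = self.sxx = self.syy = self.sxy = 0.0

    def push(self, x, y):
        # Add the newest pair to the running sums.
        self.pairs.append((x, y))
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.syy += y * y
        self.sxy += x * y
        # Once the window overflows, subtract the oldest pair.
        if len(self.pairs) > self.window:
            ox, oy = self.pairs.popleft()
            self.sx -= ox
            self.sy -= oy
            self.sxx -= ox * ox
            self.syy -= oy * oy
            self.sxy -= ox * oy

    def correlation(self):
        # Returns None when the coefficient is undefined
        # (fewer than two pairs, or a constant window).
        n = len(self.pairs)
        if n < 2:
            return None
        cov = n * self.sxy - self.sx * self.sy
        var_x = n * self.sxx - self.sx ** 2
        var_y = n * self.syy - self.sy ** 2
        if var_x <= 0 or var_y <= 0:
            return None
        return cov / math.sqrt(var_x * var_y)
```

Calling `push(x, y)` once per arriving pair and `correlation()` whenever a result is needed keeps the per-update cost independent of the window length. Running sums of this kind can lose floating-point precision on long streams with large values, so a production version would likely need a more numerically stable update.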

Keywords: data streams, data exploration, correlation, search, subsequence

Published In: Proceedings of the International Workshop on Real-Time Business Intelligence and Analytics

Pages: 3.1-3.8

Year Published: 2017

DOI: 10.1145/3129292.3129298

Project: STREAMS

Subject Area: Data Streams

Publication Type: Workshop Paper

Sponsor: Others

Citation: Rakan Alseghayer, Daniel Petrov, Panos K. Chrysanthis, Mohamed A. Sharaf, and Alexandros Labrinidis. Detection of Highly Correlated Live Data Streams. Proceedings of the International Workshop on Real-Time Business Intelligence and Analytics. 3.1-3.8. 2017. DOI: 10.1145/3129292.3129298.