REQUEST: A Scalable Framework for Interactive Construction of Exploratory QueriesDocUID: 2016-009 Full Text: PDF
Author: Xiaoyu Ge, Yanbing Xue, Zhipeng Luo, Mohamed A. Sharaf, Panos K. Chrysanthis
Abstract: Exploration over large datasets is a key first step in data analysis, as users may be unfamiliar with the underlying database schema and unable to construct precise queries that represent their interests. Such data exploration task usually involves executing numerous ad-hoc queries, which requires a considerable amount of time and human effort. In this paper, we present REQUEST, a novel framework that is designed to minimize the human effort and enable both effective and efficient data exploration. REQUEST supports the query-from-examples style of data exploration by integrating two key components: 1) Data Reduction, and 2) Query Selection. As instances of the REQUEST framework, we propose several highly scalable schemes, which employ active learning techniques and provide different levels of efficiency and effectiveness as guided by the user's preferences. Our results, on real-world datasets from Sloan Digital Sky Survey, show that our schemes on average require 1-2 orders of magnitude fewer feedback questions than the random baseline, and 3-16x fewer questions than the state-of-the-art, while maintaining interactive response time. Moreover, our schemes are able to construct, with high accuracy, queries that are often undetectable by current techniques.
Keywords: Big Data, Data Exploration, Query Formulation, Active Learning
Published In: 2016 IEEE International Conference on Big Data
Year Published: 2016
Project: REQUEST Subject Area: Data Exploration
Publication Type: Conference Paper