Autonomic performance prediction framework for data warehouse queries using lazy learning approach
Abstract
Information is one of the most important assets of an organization. In recent years, the volume of data stored in organizations, varying user requirements, time constraints, and query management complexities have grown exponentially. Due to these problems, the performance modeling of queries in data warehouses (DWs) has assumed a key role in organizations. DWs make relevant information available to decision-makers; however, DW administration is becoming increasingly difficult and time-consuming. DW administrators spend too much time managing queries, which also affects data warehouse performance. To enhance the performance of overloaded data warehouses with varying queries, a prediction-based framework is required that forecasts the behavior of query performance metrics in a DW. In this study, we propose a cluster-based autonomic performance prediction framework using a case-based reasoning approach that determines the performance metrics of the data warehouse in advance by incorporating autonomic computing characteristics. This prediction is helpful for query monitoring and management. For evaluation, we used metrics for precision, recall, accuracy, and relative error rate. The proposed approach is also compared with existing lazy learning techniques. We used the standard TPC-H dataset. Experiments show that our proposed approach produce better results compared to existing techniques.