Monday, February 24, 2020

NEAR-REAL-TIME PARALLEL ETL+Q FOR AUTOMATIC SCALABILITY IN BIGDATA


Author :  Pedro Martins

Affiliation :  University of Coimbra

Country :  Portugal

Category :  Computer Science & Information Technology

Volume, Issue, Month, Year :  6, 1, November, 2016

Abstract

In this paper we investigate the problem of providing scalability to the near-real-time ETL+Q (extract, transform, load, and query) processes of data warehouses. In general, data loading, transformation, and integration are heavy tasks that are performed only periodically, during small fixed time windows. We propose an approach to enable the automatic scalability and freshness of any data warehouse and ETL+Q process for near-real-time BigData scenarios. A general framework for testing the proposed system was implemented, supporting parallelization solutions for each part of the ETL+Q pipeline. The results show that the proposed system is able to scale to provide the desired processing speed.

Keywords :  Scalability, ETL, freshness, high-rate, performance, parallel processing, distributed systems, database, load-balance, algorithm

For More Details  :  https://airccj.org/CSCP/vol6/csit64818.pdf

