Author : Pedro Martins
Affiliation : University of Coimbra
Country : Portugal
Category : Computer Science & Information Technology
Volume, Issue, Month, Year : 6, 1, November, 2016
Abstract
In this paper we investigate the problem of providing scalability to the
near-real-time ETL+Q (extract, transform, load and query) process of data
warehouses. In general, data loading, transformation and integration are heavy
tasks that are performed only periodically during small fixed time windows. We
propose an approach to enable the automatic scalability and freshness of any
data warehouse and ETL+Q process for near-real-time Big Data scenarios. A
general framework for testing the proposed system was implemented, supporting
parallelization solutions for each part of the ETL+Q pipeline. The results show
that the proposed system is capable of scaling to provide the desired
processing speed.
Keywords :
Scalability, ETL, freshness, high-rate, performance, parallel
processing, distributed systems, database, load-balance, algorithm
For More Details :
https://airccj.org/CSCP/vol6/csit64818.pdf