66144

Author(s): 

Number of authors: 

2

Publication details

Publication type: 

Book chapter

Title: 

Chapter 8 Building Resilience into the Metadata-Based ETL Process Using Open Source Big Data Technologies

ISBN/ISSN: 

978-3-030-70370-7

DOI: 

10.1007/978-3-030-70370-7_8

Source title: 

  • Resilience in the Digital Age (Lecture Notes in Computer Science)

City: 

  • Cham, Switzerland

Publisher: 

  • Springer

Year of publication: 

2021

Pages: 

139-153, https://rd.springer.com/book/10.1007/978-3-030-70370-7
Abstract
Extract-transform-load (ETL) processes play a crucial role in data analysis in real-time data warehouse environments, which demand low latency and high availability. ETL processes are becoming bottlenecks in such environments because of growing complexity, the number of data transformation steps, the number of machines used for data processing, and the increasing impact of human factors on the development of new ETL processes. To mitigate this impact and make the ETL process resilient, a special Metadata Framework is needed that can manage the design of new data pipelines and processes. In this work, we focus on ETL metadata and its use in driving process execution, and present a proprietary approach to the design of metadata-based process control that can reduce complexity, enhance the resilience of ETL processes, and allow their adaptive self-reorganization. We present a metadata framework implementation based on open-source Big Data technologies, describing its architecture and interconnections with external systems, data model, functions, quality metrics, and templates. A test execution of an experimental Airflow Directed Acyclic Graph (DAG) with randomly selected data is performed to evaluate the proposed framework.

Bibliographic reference: 

Suleykin A.S., Panfilov P. Chapter 8 Building Resilience into the Metadata-Based ETL Process Using Open Source Big Data Technologies / Resilience in the Digital Age (Lecture Notes in Computer Science). Cham, Switzerland: Springer, 2021. pp. 139-153, https://rd.springer.com/book/10.1007/978-3-030-70370-7.