69909 | ИПУ РАН

Publication details

Publication type:

Conference paper

Title:

COMPARING HDFS – GREENPLUM DATA LOADING OPTIONS

ISBN/ISSN:

ISBN 978-3-902734-33-4, ISSN 1726-9679

DOI:

10.2507/32nd.daaam.proceedings.101

Conference name:

32ND DAAAM INTERNATIONAL SYMPOSIUM ON INTELLIGENT MANUFACTURING AND AUTOMATION (Vienna, Austria, 2021)

Source title:

Proceedings of the 32ND DAAAM INTERNATIONAL SYMPOSIUM ON INTELLIGENT MANUFACTURING AND AUTOMATION (Vienna, Austria, 2021)

City:

Vienna, Austria

Publisher:

DAAAM International

Year of publication:

2021

Pages:

0724-0731

Abstract

Over the last five years, many companies around the world have successfully implemented Apache Hadoop as the main Data Lake storage for all of their organizational data. At the same time, the adoption of other open-source technologies, such as classical MPP-based systems for analytical workloads, has also been increasing. Thus, efficient and fast data integration between Apache Hadoop and other organizational data storage systems becomes highly important for enterprises where business users and decision makers require minimal delay in heterogeneous data exchange between Hadoop and other storage systems. In this paper, we compare different options for loading data from Apache Hadoop, which represents the organization's Data Lake, into the open-source MPP database Greenplum, which serves as a classical data warehouse for analytical workloads, and choose the best one. We also identify potential risks of using the different data loading methods.

Bibliographic reference:

Suleykin A.S., Panfilov P., Bobkova A., Chumakov I. COMPARING HDFS – GREENPLUM DATA LOADING OPTIONS / Proceedings of the 32ND DAAAM INTERNATIONAL SYMPOSIUM ON INTELLIGENT MANUFACTURING AND AUTOMATION (Vienna, Austria, 2021). Vienna, Austria: DAAAM International, 2021. pp. 0724-0731.