Universitas Hasanuddin
Research output: Contribution to journal › Article › peer-review

Performance analysis of extract, transform, load (ETL) in Apache Hadoop atop NAS storage using iSCSI

Adnan

Proceedings of the 2017 4th International Conference on Computer Applications and Information Processing Technology, CAIPT 2017

Published: 2017
Citations: 11

Abstract

Data analytics has become a key element of business decision-making over the last decade. ETL is the process of migrating data from a source to the target database, and of storing and processing large volumes of structured and unstructured data for complex business analysis. Standard ETL tools do not handle such volumes efficiently, and improving ETL performance can provide a better return on a company's investment. It is therefore interesting to explore building computing-storage devices from low-cost, low-power components to perform the ETL process. In this paper, we propose running Hadoop on Network Attached Storage (NAS) accessed through iSCSI over Ethernet to process ETL, and we investigate the benefits of running Hadoop over NAS storage compared with standard HDFS, using a benchmark of extract, transform, and load performance. The experiment used one NameNode, four DataNodes, NAS storage, and a dataset to examine the proposed architecture. The results show that the proposed architecture can use low-cost components to deliver scalable performance and could become a storage solution in the Big Data space.
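
The abstract reports separate measurements for the extract, transform, and load phases. As a rough illustration of what per-phase timing can look like, here is a minimal Python sketch. It is not the paper's benchmark: the function names (`time_phase`, `run_etl`), the CSV columns, and the in-memory list standing in for the target store are all assumptions made for this example, and nothing here touches Hadoop, HDFS, or iSCSI.

```python
import csv
import io
import time


def time_phase(func, *args):
    """Run func(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def extract(raw_text):
    # Extract: parse CSV text into a list of dict rows.
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    # Transform: cast columns to numeric types and drop non-positive amounts.
    cleaned = []
    for row in rows:
        amount = float(row["amount"])
        if amount > 0:
            cleaned.append({"id": int(row["id"]), "amount": amount})
    return cleaned


def load(rows, sink):
    # Load: append rows to the target store (a plain list stands in here).
    sink.extend(rows)
    return len(sink)


def run_etl(raw_text):
    """Run the three phases in order and record the time spent in each."""
    sink = []
    timings = {}
    rows, timings["extract"] = time_phase(extract, raw_text)
    rows, timings["transform"] = time_phase(transform, rows)
    _, timings["load"] = time_phase(load, rows, sink)
    return sink, timings


# Hypothetical sample data, used only to exercise the sketch.
SAMPLE = "id,amount\n1,10.5\n2,-3.0\n3,7.25\n"
loaded, timings = run_etl(SAMPLE)
```

In a comparison like the one the abstract describes, the same three timers would be collected once per storage backend, so that the phases can be compared side by side rather than as a single end-to-end figure.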

Access to Document

10.1109/CAIPT.2017.8320716

Fingerprint

Computer science (Sciences)
iSCSI (Sciences)
Scalability (Sciences)
NoSQL (Sciences)
Database (Sciences)
Operating system (Sciences)
Benchmark (surveying) (Sciences)
Ethernet (Sciences)
Process (computing) (Sciences)
Big data (Sciences)
Key (lock) (Sciences)
Storage virtualization (Sciences)
Computer data storage (Sciences)
Storage efficiency (Sciences)
Virtualization (Sciences)
Cloud computing (Sciences)
Geodesy (Sciences)
Geography (Sciences)
Software (Sciences)