DATA PIPELINE ARCHITECTURE FOR ACADEMIC INFORMATION SYSTEM AT AKADEMI TEKNIK BIAK

Heman Koreri Israel Mnsen, Bambang Purnomosidi, Rikie Kartadie, Didi Kurnaedi

Abstract


In developing an integrated information system, architecture planning is the first step that must be completed. Planning is needed so that the system runs according to its requirements. This research uses internal data from Biak Technical Academy (ATB) and external data from the Higher Education Service Institution, Area IV, in Biak, Papua. The main goal of the research is to design a data pipeline architecture for ATB. A pipeline architecture is used to move large volumes of data efficiently from one location to another over long distances. The method used is Extract, Transform, Load (ETL). The extract step requires a supporting library in Apache Spark, namely SparkSession. A SparkSession is created so that Biak Technical Academy data stored in CSV files can be read and processed in Apache Spark. After extraction, Apache Spark reads the CSV data and transforms it. The transformed data is then loaded into a DataFrame as the output of the ETL process. The results show that Apache Spark made it easier for the authors to design the data processing of the Biak Technical Academy information system, and that it is one of the best solutions for large-scale, real-time ETL processing.
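The ETL flow described in the abstract can be illustrated with a minimal PySpark sketch. It is not the authors' implementation: the application name, the function names `normalize_column` and `run_etl`, and the cleanup rules (renaming headers, dropping empty and duplicate rows) are assumptions chosen only to show the extract, transform, and load steps in order.

```python
import re


def normalize_column(name: str) -> str:
    """Turn a raw CSV header such as ' Student Name ' into 'student_name'."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def run_etl(csv_path: str):
    """Extract a CSV file, apply simple transforms, and return a Spark DataFrame."""
    # Imported here so normalize_column can be used without a Spark installation.
    from pyspark.sql import SparkSession

    # Extract: a SparkSession is the entry point for reading data into Spark.
    spark = SparkSession.builder.appName("atb-etl-sketch").getOrCreate()
    raw = spark.read.csv(csv_path, header=True, inferSchema=True)

    # Transform: illustrative cleanup only (an assumption, not the paper's rules).
    renamed = raw.toDF(*[normalize_column(c) for c in raw.columns])
    cleaned = renamed.na.drop(how="all").dropDuplicates()

    # Load: the resulting DataFrame is the output of the ETL run.
    return cleaned
```

With a hypothetical file such as `students.csv`, calling `run_etl("students.csv").show()` would print the cleaned DataFrame. Keeping the header cleanup in a plain function makes that part of the transform easy to check on its own.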

DOI: http://dx.doi.org/10.26798/jiss.v3i1.1335

Copyright (c) 2024 Heman Koreri Israel Mnsen, Bambang Purnomosidi, Rikie Kartadie, Didi Kurnaedi


JOURNAL OF INTELLIGENT SOFTWARE SYSTEMS

Published by

Magister Teknologi Informasi
Lembaga Penelitian dan Pengabdian Masyarakat

Universitas Teknologi Digital Indonesia (formerly STMIK AKAKOM)
Jl. Raya Janti Jl. Majapahit No.143, Jaranan, Banguntapan,
Kec. Banguntapan, Kabupaten Bantul,
Daerah Istimewa Yogyakarta 55918

Creative Commons License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.