| Title |
Orchestrating Apache Nifi/Minifi Within A Spatial Data Pipeline |
| ID_Doc |
40971 |
| Authors |
Carthen C.; Zaremehrjardi A.; Le V.; Cardillo C.; Strachan S.; Tavakkoli A.; Harris F.C.; Dascalu S.M. |
| Year |
2023 |
| Published |
Proceedings - 2023 IEEE/ACIS 21st International Conference on Software Engineering Research, Management and Applications, SERA 2023 |
| DOI |
http://dx.doi.org/10.1109/SERA57763.2023.10197731 |
| Abstract |
In many smart city projects, LiDAR data is a common choice for capturing spatial information, but this decision often causes severe growing pains within the existing infrastructure. In this paper, we introduce a data pipeline that orchestrates Apache NiFi (NiFi), Apache MiNiFi (MiNiFi), and several other tools as an automated solution to relay and archive LiDAR data captured by deployed edge devices. The LiDAR sensors used in this workflow are Velodyne Ultra Puck sensors that capture at a rate of 10 frames per second and produce 6-7 GB packet capture (PCAP) files per hour. By compressing the file both after capture and in real time, we found that gzip produced a 5 GB file and saved about 5 minutes in transmission time to NiFi, and also saved considerable CPU time when compressing the file in real time. Alternatively, we chose XZ as the compression algorithm for ingesting LiDAR data onto an institutional compute cluster because of its high compression ratio. To evaluate the capabilities of our system design, the features of this data pipeline were compared against existing third-party services, namely Globus and RSync. © 2023 IEEE. |
| Author Keywords |
big data; data pipeline; data transfer; edge computing; iot; LiDAR; minifi; nifi; PCAP; smart city |