Smart City Gnosys

Smart city article details

Title SpatialSSJP: QoS-Aware Adaptive Approximate Stream-Static Spatial Join Processor
ID_Doc 52534
Authors Jawarneh I.M.A.; Bellavista P.; Corradi A.; Foschini L.; Montanari R.
Year 2024
Published IEEE Transactions on Parallel and Distributed Systems, 35, 1
DOI http://dx.doi.org/10.1109/TPDS.2023.3330669
Abstract The widespread adoption of the Internet of Things (IoT) has motivated the emergence of mixed workloads in smart cities, where fast-arriving geo-referenced big data streams are joined with archive tables, aiming at enriching streams with descriptive attributes that enable insightful analytics. Applications now rely on finding, in real time, the geographical regions to which streaming tuples belong. This problem requires a computationally intensive stream-static join that joins a dynamic stream with a disk-resident static table. In addition, the time-varying fluctuation of geospatial data arriving online calls for an approximate solution that can trade off QoS constraints while ensuring that the system survives sudden spikes in data loads. In this paper, we present SpatialSSJP, an adaptive spatial-aware approximate query processing system that specifically focuses on stream-static joins in a way that guarantees achieving an agreed set of Quality-of-Service goals and maintains geo-statistics of stateful online aggregations over stream-static join results. SpatialSSJP employs a state-of-the-art stratified-like sampling design to select well-balanced representative geospatial data stream samples and serve them to a stream-static geospatial join operator downstream. We implemented a prototype atop Spark Structured Streaming. Our extensive evaluations on big real datasets show that our system can survive and mitigate harsh join workloads and outperform state-of-the-art baselines by significant magnitudes, without risking rigorous error bounds in terms of the accuracy of the output results. SpatialSSJP achieves a relative accuracy gain against plain Spark joins of approximately 10% in the worst cases and up to 50% in the best cases.
Author Keywords Algorithms for data and knowledge management; Apache Spark; Big Data Applications; Data Architecture; Geospatial Analysis; QoS Data Management; Query Processing; Spatial databases and GIS; Spatial Indexes; Spatial Join
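
Illustrative sketch (not from the paper): the abstract describes sampling a geospatial stream in a stratified-like way and feeding the sample to a stream-static join that maintains aggregates per region. The toy Python below shows that general idea only, under loud assumptions: grid cells act as strata, the static table is a hypothetical dictionary of bounding boxes named REGIONS, and every function name is made up for illustration. It is not the SpatialSSJP algorithm and does not use Spark.

```python
import random
from collections import defaultdict

# Hypothetical static table: region name -> (min_x, min_y, max_x, max_y).
REGIONS = {
    "north": (0.0, 5.0, 10.0, 10.0),
    "south": (0.0, 0.0, 10.0, 5.0),
}


def cell_of(x, y, cell_size=1.0):
    """Map a point to a coarse grid cell; the cell is the sampling stratum."""
    return (int(x // cell_size), int(y // cell_size))


def stratified_sample(batch, fraction, rng, cell_size=1.0):
    """Keep about `fraction` of the points in each cell (at least one).

    Returns (point, weight) pairs, where weight = cell size / sample size,
    so that weighted counts estimate the counts of the full batch.
    """
    strata = defaultdict(list)
    for x, y in batch:
        strata[cell_of(x, y, cell_size)].append((x, y))
    sample = []
    for points in strata.values():
        k = min(len(points), max(1, round(len(points) * fraction)))
        weight = len(points) / k
        sample.extend((p, weight) for p in rng.sample(points, k))
    return sample


def region_of(x, y):
    """Join step: find the region whose box contains the point, else None."""
    for name, (min_x, min_y, max_x, max_y) in REGIONS.items():
        if min_x <= x < max_x and min_y <= y < max_y:
            return name
    return None


def join_and_estimate(batch, fraction, seed=0):
    """Sample one micro-batch, join it with REGIONS, estimate per-region counts."""
    rng = random.Random(seed)
    counts = defaultdict(float)
    for (x, y), weight in stratified_sample(batch, fraction, rng):
        region = region_of(x, y)
        if region is not None:
            counts[region] += weight
    return dict(counts)
```

In this sketch, lowering `fraction` reduces the number of tuples that reach the join, which is the kind of load-versus-accuracy trade-off the abstract refers to. How the real system chooses the sampling rate to meet its QoS goals is not stated in this record.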


Similar Articles


Id | Similarity | Authors | Title | Published
52529 | 0.87 | Jawarneh I.M.A.; Bellavista P.; Corradi A.; Foschini L.; Montanari R. | Spatially Representative Online Big Data Sampling For Smart Cities | IEEE International Workshop on Computer Aided Modeling and Design of Communication Links and Networks, CAMAD, 2020-September (2020)
53182 | 0.866 | Barbour W.; Wilbur M.; Sandoval R.; Dubey A.; Work D.B. | Streaming Computation Algorithms For Spatiotemporal Micromobility Service Availability | Proceedings - 2020 IEEE Workshop on Design Automation for CPS and IoT, DESTION 2020 (2020)
34458 | 0.864 | Ji H.; Wu G.; Zhao Y.; Wang S.; Wang G.; Yuan G.Y. | Jointree: A Novel Join-Oriented Multivariate Operator For Spatio-Temporal Data Management In Flink | GeoInformatica, 27, 1 (2023)
47327 | 0.854 | Cuzzocrea A. | Scalable Joins Over Big Data Streams: Actual And Future Research Trends | IEEE International Conference on Data Mining Workshops, ICDMW, 2022-November (2022)
52339 | 0.853 | Li L.; Liu W.; Zhong Z.; Huang C. | Sp-Phoenix: A Massive Spatial Point Data Management System Based On Phoenix | Proceedings - 20th International Conference on High Performance Computing and Communications, 16th International Conference on Smart City and 4th International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2018 (2019)
35474 | 0.853 | Jawarneh I.M.A.; Bellavista P.; Corradi A.; Foschini L.; Montanari R. | Locality-Preserving Spatial Partitioning For Geo Big Data Analytics In Main Memory Frameworks | Proceedings - IEEE Global Communications Conference, GLOBECOM (2020)
22314 | 0.85 | Jawarneh I.M.A.; Foschini L.; Corradi A. | Efficient Generation Of Approximate Region-Based Geo-Maps From Big Geotagged Data | IEEE International Workshop on Computer Aided Modeling and Design of Communication Links and Networks, CAMAD (2023)