Smart City Gnosys

Smart city article details

Title Improving Time Series Data Quality: Identifying Outliers And Handling Missing Values In A Multilocation Gas And Weather Dataset
ID_Doc 30974
Authors AlSalehy A.S.; Bailey M.
Year 2025
Published Smart Cities, 8, 3
DOI http://dx.doi.org/10.3390/smartcities8030082
Abstract Highlights: What are the main findings? A hybrid rule-based and statistical method significantly improves detection of outliers in time series gas and weather data across multiple locations. An imputation strategy combining temporal and spatial information leads to more reliable handling of missing values. What is the implication of the main finding? The proposed methods enhance the quality of environmental sensor data to be used in descriptive and predictive analytics. This approach can be adapted to other domains with similar challenges in multivariate time series datasets.

High-quality data are foundational to reliable environmental monitoring and urban planning in smart cities, yet challenges like missing values and outliers in air pollution and meteorological time series data are critical barriers. This study developed and validated a dual-phase framework to improve data quality using a 60-month gas and weather dataset from Jubail Industrial City, Saudi Arabia, an industrial region. First, outliers were identified via statistical methods like Interquartile Range and Z-Score. Machine learning algorithms like Isolation Forest and Local Outlier Factor were also used, chosen for their robustness to non-normal data distributions, significantly improving subsequent imputation accuracy. Second, missing values in both single and sequential gaps were imputed using linear interpolation, Piecewise Cubic Hermite Interpolating Polynomial (PCHIP), and Akima interpolation. Linear interpolation excelled for short gaps (R² up to 0.97), and PCHIP and Akima minimized errors in sequential gaps (R² up to 0.95, lowest MSE). By aligning methods with gap characteristics, the framework handles real-world data complexities, significantly improving time series consistency and reliability. This work demonstrates a significant improvement in data reliability, offering a replicable model for smart cities worldwide. © 2025 by the authors.
Author Keywords air quality monitoring; data imputation; data quality; environmental monitoring; meteorological data; missing values; outliers; PCHIP interpolation; smart cities; time series
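
The two phases described in the abstract (flag outliers, then impute gaps) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the synthetic series, the function names `flag_outliers_iqr` and `fill_gaps`, and the 1.5 x IQR fence are assumptions made for illustration, and only the Interquartile Range rule is shown (Z-Score, Isolation Forest, and Local Outlier Factor are omitted). The interpolators are `numpy.interp` and SciPy's `PchipInterpolator` and `Akima1DInterpolator`.

```python
import numpy as np
from scipy.interpolate import Akima1DInterpolator, PchipInterpolator


def flag_outliers_iqr(values, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR]; NaNs are never flagged."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.nanpercentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


def fill_gaps(t, values, method="linear"):
    """Fill NaN gaps from the known points; intended for interior gaps only."""
    t = np.asarray(t, dtype=float)
    values = np.asarray(values, dtype=float)
    known = ~np.isnan(values)
    if method == "linear":
        estimate = np.interp(t, t[known], values[known])
    elif method == "pchip":
        estimate = PchipInterpolator(t[known], values[known])(t)
    elif method == "akima":
        estimate = Akima1DInterpolator(t[known], values[known])(t)
    else:
        raise ValueError(f"unknown method: {method}")
    # Keep observed values untouched; use the estimate only inside gaps.
    return np.where(known, values, estimate)


# Synthetic hourly readings (an assumption, not data from the study).
t = np.arange(12.0)
truth = 20.0 + 5.0 * np.sin(t / 4.0)

observed = truth.copy()
observed[4] = 95.0            # injected spike (outlier)
observed[7] = np.nan          # single missing value
observed[9:11] = np.nan       # sequential gap of two values

# Phase 1: flag outliers and treat them as missing.
flags = flag_outliers_iqr(observed)
cleaned = np.where(flags, np.nan, observed)

# Phase 2: impute the gaps with each interpolation method.
results = {m: fill_gaps(t, cleaned, m) for m in ("linear", "pchip", "akima")}

for name, filled in results.items():
    print(name, "max abs error:", float(np.max(np.abs(filled - truth))))
```

Masking flagged points before interpolation keeps the spike from distorting the neighbouring estimates, which mirrors the ordering the abstract describes. Gaps at the start or end of a series would need separate handling, since the three interpolators treat extrapolation differently.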


Similar Articles


Id | Similarity | Authors | Title | Published
17323 | 0.917 | Van Zoest V.; Liu X.; Ngai E. | Data Quality Evaluation, Outlier Detection And Missing Data Imputation Methods For IoT In Smart Cities | Studies in Computational Intelligence, 971 (2021)
15167 | 0.897 | Zafeirelli S.; Kavroudakis D. | Comparison Of Outlier Detection Approaches In A Smart Cities Sensor Data Context | International Journal on Smart Sensing and Intelligent Systems, 17, 1 (2024)
49184 | 0.877 | Garrido-Hidalgo C.; Solmaz G.; Jacobs T.; Roda-Sanchez L. | Smart Beestricts: Improving The Spatial Resolution Of Air-Quality Data In Madrid Through Transfer Learning | International Journal of Geographical Information Science (2025)
37127 | 0.861 | Eid M.M.; Eldahshan K.; Abouali A.H. | Missing Data In Smart Cities: An Imputation Algorithm Based On Sine/Cosine Optimization Algorithm | 2024 International Conference on Computer and Applications, ICCA 2024 (2024)
53203 | 0.858 | Qin X.; Do T.H.; Hofman J.; Rodrigo E.; Panzica V.L.M.; Deligiannis N.; Philips W. | Street-Level Air Quality Inference Based On Geographically Context-Aware Random Forest Using Opportunistic Mobile Sensor Network | ACM International Conference Proceeding Series, PartF171546 (2021)
14759 | 0.857 | Rafii F.; Kechadi T. | Collection Of Historical Weather Data: Issues With Missing Values | ACM International Conference Proceeding Series (2019)
37126 | 0.857 | Srinivas L.N.B.; Jayavel K. | Missing Data Estimation And Imputation Algorithm For Wireless Sensor Network Applications | 2022 International Conference on Computer Communication and Informatics, ICCCI 2022 (2022)
33702 | 0.853 | Alnowaiser K.; Alarfaj A.A.; Alabdulqader E.A.; Umer M.; Cascone L.; Alankar B. | IoT Based Smart Framework To Predict Air Quality In Congested Traffic Areas Using SV-CNN Ensemble And KNN Imputation Model | Computers and Electrical Engineering, 118 (2024)
4681 | 0.852 | Bhanja S.; Metia S.; Das A. | A Smart City Air Quality Data Imputation Method Using Markov Weights-Based Fuzzy Transfer Learning | IETE Journal of Research, 69, 9 (2023)
60731 | 0.85 | Samal K.K.R. | Utilizing Deep Learning Techniques For A Multi-Objective Pollution Forecasting Model To Enhance Smart City Sustainability | Intelligent Computing and Emerging Communication Technologies, ICEC 2024 (2024)