Abstract
The segmentation of high-resolution remote sensing images remains challenging due to the complex characteristics of objects, including intricate structures, varied shapes and textures, blur, and shadows. Traditional U-shaped convolutional neural networks struggle to capture fine contextual details and long-range dependencies in such images, leading to poor boundary detection and loss of detailed information. To address these limitations, an enhanced architecture named the Inception Attention Residual U-Net (IARU-Net) is proposed, which integrates inception modules, attention gates, residual connections, and a multi-scale fusion block. The inception modules capture multi-scale contextual features through parallel convolutions with varying receptive fields, preserving spatial information. Attention gates selectively emphasize informative regions while suppressing irrelevant background responses, sharpening the model’s focus on building boundaries and fine structures. Residual connections facilitate gradient flow, mitigate vanishing gradients, and support deeper feature learning with faster, more stable convergence. IARU-Net achieves an accuracy of 95.76%, a precision of 91.36%, and a recall of 88.85% on the Massachusetts Building dataset, improvements of 4.20%, 10.24%, and 9.44%, respectively, over the baseline model. On the WHU Building dataset, it achieves a precision of 95.22%, a recall of 95.60%, an F1-score of 95.60%, and an IoU of 92.29%, improvements of 3.36%, 4.05%, 3.41%, and 7.09%, respectively. These results demonstrate the effectiveness of IARU-Net for semantic segmentation of high-resolution remote sensing images, and its ability to delineate complex object boundaries makes it well suited to real-world applications. In urban mapping, it supports accurate extraction of building footprints and infrastructure layouts for smart city planning and urban expansion analysis. In disaster management, it enables precise segmentation of damaged buildings and affected areas for rapid damage assessment and resource allocation. In precision agriculture, it aids in identifying field boundaries, monitoring crop health, and detecting anomalies, supporting informed, data-driven decisions.
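To make the abstract's description of the building blocks concrete, the following is a minimal sketch, assuming a PyTorch implementation, of an inception-style block with a residual connection and an additive attention gate. The module names, channel sizes, and layer choices are hypothetical illustrations of the described mechanisms, not the authors' IARU-Net implementation.

```python
# Illustrative sketch only (assumes PyTorch); module names and channel sizes
# are hypothetical and not taken from the paper.
import torch
import torch.nn as nn


class InceptionResidualBlock(nn.Module):
    """Parallel convolutions with different receptive fields plus a residual path."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        branch_ch = out_ch // 3
        self.branch1 = nn.Conv2d(in_ch, branch_ch, kernel_size=1)
        self.branch3 = nn.Conv2d(in_ch, branch_ch, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(in_ch, out_ch - 2 * branch_ch, kernel_size=5, padding=2)
        self.skip = nn.Conv2d(in_ch, out_ch, kernel_size=1)  # projection for the residual path
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # Multi-scale context from parallel branches, fused by channel concatenation.
        multi_scale = torch.cat(
            [self.branch1(x), self.branch3(x), self.branch5(x)], dim=1
        )
        return self.act(self.bn(multi_scale) + self.skip(x))  # residual connection


class AttentionGate(nn.Module):
    """Additive attention gate that re-weights encoder features using the decoder signal."""

    def __init__(self, enc_ch, dec_ch, inter_ch):
        super().__init__()
        self.w_enc = nn.Conv2d(enc_ch, inter_ch, kernel_size=1)
        self.w_dec = nn.Conv2d(dec_ch, inter_ch, kernel_size=1)
        self.psi = nn.Sequential(
            nn.ReLU(inplace=True),
            nn.Conv2d(inter_ch, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, enc_feat, dec_feat):
        # Attention coefficients emphasize informative regions and suppress background.
        alpha = self.psi(self.w_enc(enc_feat) + self.w_dec(dec_feat))
        return enc_feat * alpha


if __name__ == "__main__":
    x = torch.randn(1, 64, 128, 128)    # encoder feature map
    g = torch.randn(1, 128, 128, 128)   # decoder (gating) signal at the same resolution
    block = InceptionResidualBlock(64, 96)
    gate = AttentionGate(enc_ch=96, dec_ch=128, inter_ch=48)
    y = gate(block(x), g)
    print(y.shape)  # torch.Size([1, 96, 128, 128])
```

In a U-shaped network, blocks of this kind would typically replace plain convolutional stages in the encoder and decoder, with the attention gate applied to skip connections before feature fusion.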