| Abstract |
Accurate building extraction is critical for monitoring the spatiotemporal dynamics of urban development, and high-resolution remote sensing images provide essential data support for it. However, existing deep-learning methods for building extraction struggle with the diversity and complexity of buildings in high-resolution images, which leads to poor edge regularity and loss of detail in the extracted buildings. To address these challenges, this paper proposes an extended version of the SamGeo model that extracts object masks to achieve coarse-grained segmentation of building objects. Edge features are additionally extracted to serve as fine-grained supervision, giving a dual-supervision segmentation scheme. Furthermore, an edge-preserving loss function is introduced, which uses the edge features extracted by the SamGeo model during cyclic validation to enhance the SegNeXt model's ability to capture building boundary features. Together, these components form a precise building extraction framework based on the generalized SamGeo and SegNeXt models (ESSegNeXt), which integrates the generalized SegNeXt model with multi-object segmentation techniques. To evaluate the effectiveness and generalizability of the proposed framework, three experiments were conducted using remote sensing images from the Jilin-1 satellite covering Wuhan City, drone images covering Nanjing City, and the Inria public datasets. The experimental results indicate that the proposed framework significantly improves edge detail preservation and regularity compared with state-of-the-art models, including SegNeXt, ConvNeXt, and Mask2Former. Specifically, on the Jilin-1 remote sensing datasets, the IoU, mFscore, and mIoU improved by 1.16%, 0.77%, and 1.98%, respectively, while edge preservation performance increased by 16.82%. On the drone datasets, the IoU, mFscore, mIoU, and average edge IoU improved by 0.91%, 0.35%, 0.59%, and 12.52%, respectively. On the Inria public datasets, the same four metrics improved by 0.35%, 0.22%, 0.37%, and 8.03%, respectively. The proposed ESSegNeXt framework effectively preserves building edge details, offers reliable technical support for building extraction in complex scenes, and holds significant potential for applications such as smart city development and land resource monitoring. © 2008-2012 IEEE. |
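The abstract reports both region-level IoU and an edge IoU but does not define the latter. The following is a minimal sketch of how the two can differ, assuming binary masks stored as nested lists and assuming that "edge" means foreground pixels with a 4-neighbour that is background or lies outside the image. The function names `iou`, `edge_mask`, and `edge_iou` are hypothetical and are not taken from the paper, whose exact edge metric may differ.

```python
def iou(pred, target):
    """Intersection over union of two equal-sized binary masks (2D lists of 0/1)."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Convention assumed here: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def edge_mask(mask):
    """Keep only foreground pixels that touch background or the image border.

    Assumption: a 4-neighbourhood, one-pixel-wide boundary definition.
    """
    h, w = len(mask), len(mask[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < h and 0 <= nx < w)
                if outside or not mask[ny][nx]:
                    edges[y][x] = 1
                    break
    return edges


def edge_iou(pred, target):
    """IoU computed on the boundary pixels only."""
    return iou(edge_mask(pred), edge_mask(target))


# Toy example: a 3x3 building and a prediction shifted one pixel to the right.
target = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0],
]

print(iou(pred, target))       # region-level overlap
print(edge_iou(pred, target))  # boundary-only overlap
```

In this toy case the one-pixel shift costs more on the boundary-only score than on the region score. Under these assumptions, that is consistent with the edge metric moving by larger margins than IoU in the reported results.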