Smart City Gnosys

Smart city article details

Title Building Type Classification Using CNN-Transformer Cross-Encoder Adaptive Learning From Very High Resolution Satellite Images
ID_Doc 13116
Authors Zhang S.; Li M.; Zhao W.; Wang X.; Wu Q.
Year 2025
Published IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 18
DOI http://dx.doi.org/10.1109/JSTARS.2024.3501678
Abstract Building type information indicates the functional properties of buildings and plays a crucial role in smart city development and urban socioeconomic activities. Existing methods for classifying building types often struggle to distinguish between building types accurately while maintaining well-delineated boundaries, especially in complex urban environments. This study introduces a novel framework, the CNN-Transformer cross-attention feature fusion network (CTCFNet), for building type classification from very high resolution remote sensing images. CTCFNet integrates convolutional neural networks (CNNs) and Transformers using an interactive cross-encoder fusion module that enhances semantic feature learning and improves classification accuracy in complex scenarios. We develop an adaptive collaboration optimization module that applies human visual attention mechanisms to enhance the feature representation of building types and boundaries simultaneously. To address the scarcity of datasets for building type classification, we create two new datasets for model evaluation: the urban building type (UBT) dataset and the town building type (TBT) dataset. Extensive experiments on these datasets demonstrate that CTCFNet outperforms popular CNNs, Transformers, and dual-encoder methods in identifying building types across various regions, achieving the highest mean intersection over union of 78.20% and 77.11%, F1 scores of 86.83% and 88.22%, and overall accuracy of 95.07% and 95.73% on the UBT and TBT datasets, respectively. We conclude that CTCFNet effectively addresses the challenges of high interclass similarity and intraclass inconsistency in complex scenes, yielding results with well-delineated building boundaries and accurate building types.
Author Keywords Building type classification; CNN-transformer networks; cross-encoder; feature interaction; very high resolution remote sensing
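
The abstract reports mean intersection over union (mIoU), F1 score, and overall accuracy (OA). As a minimal sketch of how such figures are commonly derived from a per-pixel confusion matrix, the example below assumes class-wise (macro) averaging for mIoU and F1; the abstract does not state the averaging scheme, so that choice, the function name `segmentation_metrics`, and the toy two-class matrix are all hypothetical and not taken from the paper.

```python
def segmentation_metrics(confusion):
    """Compute OA, mean IoU, and macro F1 from a square confusion matrix.

    confusion[i][j] is the number of pixels whose true class is i
    and whose predicted class is j. Macro averaging is an assumption.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))

    ious, f1s = [], []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # true c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true other
        union = tp + fp + fn
        ious.append(tp / union if union else 0.0)
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)

    return {
        "oa": correct / total if total else 0.0,
        "miou": sum(ious) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical two-class example (rows: ground truth, columns: prediction).
toy = [[50, 10],
       [5, 35]]
metrics = segmentation_metrics(toy)
print({name: round(value, 4) for name, value in metrics.items()})
```

Because OA counts every correctly labeled pixel equally, it can sit well above mIoU when one class dominates the image, which is consistent with the gap between the OA and mIoU values quoted in the abstract.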


Similar Articles


Id | Similarity | Authors | Title | Published
4137 | 0.888 | Yang M.; Zhao L.; Ye L.; Jiang H.; Yang Z. | A Review Of Convolutional Neural Networks Related Methods For Building Extraction From Remote Sensing Images; [基于卷积神经网络的遥感影像建筑物提取方法综述] | Journal of Geo-Information Science, 26, 6 (2024)
47466 | 0.867 | Pan X.; Xu K.; Yang S.; Liu Y.; Zhang R.; He P. | SDA-Net: A Spatially Optimized Dual-Stream Network With Adaptive Global Attention For Building Extraction In Multi-Modal Remote Sensing Images | Sensors, 25, 7 (2025)
21839 | 0.866 | Holail S.; Saleh T.; Xiao X.; Zahran M.; Xia G.-S.; Li D. | Edge-CVT: Edge-Informed CNN And Vision Transformer For Building Change Detection In Satellite Imagery | ISPRS Journal of Photogrammetry and Remote Sensing, 227 (2025)
13125 | 0.866 | Yuan Q.; Wang N. | Buildings Change Detection Using High-Resolution Remote Sensing Images With Self-Attention Knowledge Distillation And Multiscale Change-Aware Module | International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences - ISPRS Archives, 46, M-2-2022 (2022)
38966 | 0.863 | Zhou Y.; Jiang W.; Wang B. | NESF-Net: Building Roof And Facade Segmentation Based On Neighborhood Relationship Awareness And Scale-Frequency Modulation Network For High-Resolution Remote Sensing Images | ISPRS Journal of Photogrammetry and Remote Sensing, 226 (2025)
55087 | 0.861 | Sun H.; Xu H.; Wei Q. | The Classification Method Of Urban Architectural Styles Based On Deep Learning And Street View Imagery | Advances in Transdisciplinary Engineering, 31 (2022)
13075 | 0.86 | Han Z.; Li X.; Wang X.; Wu Z.; Liu J. | Building Segmentation In Urban And Rural Areas With MFA-Net: A Multidimensional Feature Adjustment Approach | Sensors, 25, 8 (2025)
5676 | 0.858 | Chatterjee S.; Saha S.; Mahapatra P.R.S. | A Two-Stage CNN Based Satellite Image Analysis Framework For Estimating Building-Count In Residential Built-Up Area | Lecture Notes in Networks and Systems, 998 LNNS (2024)
2916 | 0.853 | Yin J.; Wu F.; Qiu Y.; Li A.; Liu C.; Gong X. | A Multiscale And Multitask Deep Learning Framework For Automatic Building Extraction | Remote Sensing, 14, 19 (2022)
305 | 0.85 | Lu Z.; Xu T.; Liu K.; Liu Z.; Zhou F.; Liu Q. | 5M-Building: A Large-Scale High-Resolution Building Dataset With CNN Based Detection Analysis | Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI, 2019-November (2019)