Journal of East China Normal University (Natural Science) ›› 2025, Vol. 2025 ›› Issue (4): 49-60. doi: 10.3969/j.issn.1000-5641.2025.04.005



C-T Net: Remote sensing image change detection model integrating CNN and Transformer

Yi WU1,2, Shilin YUN1

  1. School of Electronic Information Engineering, Hebei University of Technology, Tianjin 300401, China
    2. National Experimental Teaching Demonstration Center of Electronics and Communication Engineering, Hebei University of Technology, Tianjin 300401, China
  • Received: 2023-10-13  Accepted: 2024-04-19  Online: 2025-07-25  Published: 2025-07-19
  • About the author: WU Yi, female, professor; her research focuses on the study and application of intelligent control systems. E-mail: wuyihbgydx@163.com
  • Supported by: National Natural Science Foundation of China (51977059); Natural Science Foundation of Hebei Province (E2020202042)


Abstract:

Bi-temporal remote sensing images often exhibit various pseudo-changes caused by differences in acquisition time, viewing angle, and sensor characteristics, along with changes that are not of interest. Moreover, the location of a change is usually correlated with surrounding objects, and a fully convolutional network (FCN) loses this long-range information. To address these issues, this study proposes C-T Net, a network that integrates convolutional neural networks (CNN) and the Transformer. The overall architecture consists of a deep feature extraction part and a detection head. The backbone combines a CNN with a Swin Transformer, and two fusion modules, C-to-T and T-to-C, are designed to aggregate local and global features. The detection head applies Transformer encoding and decoding to derive refined feature maps for discriminating changed regions. In comparative experiments with multiple change detection models, the proposed method achieves the highest change-class scores on both the LEVIR-CD and WHU-CD datasets, with $F1_1$ of 90.63% and 86.24% and $\mathrm{IoU}_1$ of 82.87% and 75.81%, respectively. Results on the two datasets show that the model outperforms existing methods in terms of both visual quality and quantitative metrics.
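The abstract names the C-to-T and T-to-C fusion modules without detailing their design. Below is a minimal PyTorch sketch of one plausible reading: C-to-T projects the CNN feature map into the token space of the Transformer branch, while T-to-C folds global tokens back into the convolutional feature map. All class names, layer choices, and shapes here are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class CToT(nn.Module):
    """C-to-T (hypothetical sketch): inject local CNN features into the token stream."""
    def __init__(self, cnn_channels: int, token_dim: int):
        super().__init__()
        self.proj = nn.Conv2d(cnn_channels, token_dim, kernel_size=1)
        self.norm = nn.LayerNorm(token_dim)

    def forward(self, cnn_feat: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        # cnn_feat: (B, C, H, W); tokens: (B, H*W, D)
        injected = self.proj(cnn_feat).flatten(2).transpose(1, 2)  # (B, H*W, D)
        return self.norm(tokens + injected)

class TToC(nn.Module):
    """T-to-C (hypothetical sketch): fold global tokens back into the CNN feature map."""
    def __init__(self, token_dim: int, cnn_channels: int):
        super().__init__()
        self.proj = nn.Conv2d(token_dim, cnn_channels, kernel_size=1)
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * cnn_channels, cnn_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(cnn_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, tokens: torch.Tensor, cnn_feat: torch.Tensor) -> torch.Tensor:
        b, c, h, w = cnn_feat.shape
        # Reshape the token sequence back onto the spatial grid: (B, D, H, W).
        grid = tokens.transpose(1, 2).reshape(b, -1, h, w)
        return self.fuse(torch.cat([cnn_feat, self.proj(grid)], dim=1))

# Shape check with dummy inputs (B=2, C=64, D=96, 32x32 grid).
if __name__ == "__main__":
    feat, toks = torch.randn(2, 64, 32, 32), torch.randn(2, 32 * 32, 96)
    toks = CToT(64, 96)(feat, toks)    # tokens enriched with local detail
    feat = TToC(96, 64)(toks, feat)    # feature map enriched with global context
    print(toks.shape, feat.shape)      # torch.Size([2, 1024, 96]) torch.Size([2, 64, 32, 32])
```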

Key words: multi-temporal, change detection, convolutional neural networks (CNN), transformer, feature fusion
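For reference, $F1_1$ and $\mathrm{IoU}_1$ in the abstract denote the F1 score and intersection-over-union computed for the "changed" class (label 1) of the binary change map. The sketch below uses the standard definitions of these metrics; it is illustrative and not the paper's evaluation code.

```python
import numpy as np

def change_class_metrics(pred: np.ndarray, gt: np.ndarray) -> tuple[float, float]:
    """F1_1 and IoU_1 for binary change maps (0 = unchanged, 1 = changed)."""
    tp = np.sum((pred == 1) & (gt == 1))  # changed pixels correctly detected
    fp = np.sum((pred == 1) & (gt == 0))  # false alarms
    fn = np.sum((pred == 0) & (gt == 1))  # missed changes
    eps = 1e-10  # guard against division by zero
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    iou = tp / (tp + fp + fn + eps)
    return f1, iou
```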
