China Safety Science Journal ›› 2025, Vol. 35 ›› Issue (5): 161-168. doi: 10.16265/j.cnki.issn1003-3033.2025.05.0701

• Safety Engineering Technology •

Research on cross-visual pedestrian monitoring based on virtual simulation in buildings

TAO Zhenxiang1, LI Ying1, HUANG Xubo1, WANG Yisen1, ZHANG Ping2, YANG Rui2

  1 School of Emergency Management and Safety Engineering, China University of Mining and Technology-Beijing, Beijing 100083, China
  2 School of Safety Science, Tsinghua University, Beijing 100084, China
  • Received: 2024-12-10; Revised: 2025-02-28; Published: 2025-05-28
  • About the authors:
    TAO Zhenxiang (1990—), male, a native of Weinan, Shaanxi, PhD, lecturer, mainly engaged in research on fire emergency evacuation and related topics. E-mail:
    TAO Zhenxiang, Lecturer
    YANG Rui, Associate Research Fellow
  • Funding:
    Supported by the Young Scientists Fund of the National Natural Science Foundation of China (52304273); the Open Fund of the Key Laboratory of Civil Aviation Emergency Science and Technology (NJ2022022); and the Fundamental Research Funds for the Central Universities (2023XJAQ01)


Abstract:

To address the high cost of collecting multi-camera video data and the difficulty of sustaining high-quality annotation over long periods in high-rise buildings or complex open building environments, this study realizes the generation of cross-visual multi-camera video data and the automatic annotation of pedestrian images. Firstly, a virtual reality scene was designed to simulate pedestrian movement and automatically obtain annotated data. Secondly, unsupervised domain adaptation methods were studied to reduce the difference in feature distribution between source-domain and target-domain data, enabling the model to generalize to the target building scene. Finally, the model's generalization ability was verified. The results show that the constructed virtual reality scene effectively overcomes the difficulties of cross-visual video data collection and high-quality annotation; the unsupervised domain adaptation method raised the average first-hit rate from 22.02% to 45.48%; and combining source-domain style transfer, data augmentation, and target-domain pseudo-label generation increased the first-hit rate by 20%, reducing distribution bias and helping the model generalize across different building scenes.
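The target-domain pseudo-label generation step mentioned in the abstract can be illustrated with a minimal sketch. This is a hypothetical toy example, not the paper's actual method: each unlabeled target-domain feature is assigned to its nearest source-class centroid, and only assignments whose nearest centroid clearly beats the second-nearest (by a `margin` threshold) are kept as pseudo-labels. All names (`pseudo_label`, `centroid`, the toy 2-D features) are illustrative assumptions.

```python
# Toy pseudo-labeling sketch: nearest-centroid assignment with a
# confidence margin, over small 2-D feature vectors (stdlib only).
import math

def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pseudo_label(source, target, margin=0.5):
    """source: {class_id: [feature, ...]} labeled source-domain features.
    target: [feature, ...] unlabeled target-domain features.
    Returns [(feature, class_id)] for targets whose nearest class
    centroid beats the second-nearest by at least `margin`."""
    cents = {c: centroid(vs) for c, vs in source.items()}
    labelled = []
    for f in target:
        ranked = sorted(cents, key=lambda c: dist(f, cents[c]))
        best, second = ranked[0], ranked[1]
        # Confidence filter: ambiguous samples get no pseudo-label.
        if dist(f, cents[second]) - dist(f, cents[best]) >= margin:
            labelled.append((f, best))
    return labelled

source = {0: [[0.0, 0.0], [0.2, 0.1]], 1: [[3.0, 3.0], [3.1, 2.9]]}
target = [[0.1, 0.2], [2.9, 3.1], [1.5, 1.5]]  # last sample is ambiguous
print(pseudo_label(source, target))
```

In a full pipeline the accepted pseudo-labels would be fed back as supervision on the target domain, and the margin trades label coverage against label noise.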

Key words: building scenes, virtual simulation, cross-visual, pedestrian movement, automatic annotation

CLC number: