Education

Wuhan University of Technology
M. E. in Computer Science and Technology, advised by Pengfei Duan and Shengwu Xiong.
Fall 2023 - Present
Wuhan University of Technology
B. E. in Data Science and Big Data Technology, advised by Pengfei Duan.
Fall 2019 - Spring 2023

Employment

Chinasoft International
Big Data Engineering Intern. Learned big data development technologies and completed the development of a big data management system for books.
July 2022 - Aug 2022
Wuhan, China

Publications

Scene Knowledge Enhanced Multimodal Retrieval Model for Dense Video Captioning
We introduce a Memory Enhanced Visual-Speech Aggregation model for dense video captioning, inspired by cognitive informatics on human memory recall. The model enhances visual representations by merging them with relevant text features, which are retrieved from a memory bank through multimodal retrieval over transcribed speech and visual inputs.
2025 Twenty-first International Conference on Intelligent Computing (ICIC 2025)
LDiT: Pseudo-Label Noise Adaptation via Label Diffusion Transformer
J. Peng, P. Duan, M. Huang, S. Xiong
We reformulate label prediction as a progressive refinement process starting from an initial random guess, and propose LDiT (Label Diffusion Transformer) for pseudo-label noise adaptation. By modeling label uncertainty through a diffusion process, LDiT enables more robust learning under noisy supervision. In addition, to effectively capture the long-range dependencies in textual data, we adopt a Transformer-based latent denoising architecture with self-attention mechanisms.
2025 Twenty-first International Conference on Intelligent Computing (ICIC 2025)
ST-CLIP: Spatio-Temporal enhanced CLIP towards Dense Video Captioning
We propose a new factorized spatio-temporal self-attention paradigm to address inaccurate event descriptions caused by insufficient modeling of temporal relationships between video frames, and apply it to dense video captioning.
2024 Twentieth International Conference on Intelligent Computing (ICIC 2024)