Posts by Collection

publications

Stealthy and Effective Physical Adversarial Attacks in Autonomous Driving

Published in IEEE Transactions on Information Forensics and Security (TIFS), 2024

We propose stealthy and effective physical adversarial attack methods targeting perception systems in autonomous driving.

Recommended citation: Mingfu Zhou, Wei Zhou, Jie Huang, Jianyuan Yang, Mingxing Du, Qiang Li. (2024). "Stealthy and Effective Physical Adversarial Attacks in Autonomous Driving." IEEE Transactions on Information Forensics and Security. 19, 6795-6809.

RefHCM: A Unified Model for Referring Perceptions in Human-Centric Scenarios

Published in IEEE Transactions on Multimedia (TMM), 2025

RefHCM is a unified framework that integrates a wide range of human-centric referring tasks into a sequence-to-sequence paradigm using a plain encoder-decoder transformer.

Recommended citation: Jie Huang, Ruibing Hou, Jiahe Zhao, Hong Chang, Shiguang Shan. (2025). "RefHCM: A Unified Model for Referring Perceptions in Human-Centric Scenarios." IEEE Transactions on Multimedia.

Revisiting Multimodal Positional Encoding in Vision-Language Models

Published as an arXiv preprint, 2025

We propose MHRoPE and MRoPE-I, simple and plug-and-play positional encoding variants that consistently outperform existing approaches in vision-language models.

Recommended citation: Jie Huang, Xuejing Liu, Shijie Song, Ruibing Hou, Hong Chang, Jinlin Lin, Shuai Bai. (2025). "Revisiting Multimodal Positional Encoding in Vision-Language Models." arXiv preprint arXiv:2510.23095.

Qwen3-VL Technical Report

Published as an arXiv preprint, 2025

Qwen3-VL is the most capable vision-language model in the Qwen series to date, natively supporting interleaved contexts of up to 256K tokens that mix text, images, and video.

Recommended citation: Shuai Bai, ..., Jie Huang, ..., et al. (2025). "Qwen3-VL Technical Report." arXiv preprint arXiv:2511.21631.

JJJYmmm (Jie Huang)