Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Posts

portfolio

Portfolio item number 2

Short description of portfolio item number 2

publications

Stealthy and Effective Physical Adversarial Attacks in Autonomous Driving

Published in IEEE Transactions on Information Forensics and Security (TIFS), 2024

We propose stealthy and effective physical adversarial attack methods targeting perception systems in autonomous driving.

Recommended citation: Mingfu Zhou, Wei Zhou, Jie Huang, Jianyuan Yang, Mingxing Du, Qiang Li. (2024). "Stealthy and Effective Physical Adversarial Attacks in Autonomous Driving." IEEE Transactions on Information Forensics and Security. 19, 6795-6809.

RefHCM: A Unified Model for Referring Perceptions in Human-Centric Scenarios

Published in IEEE Transactions on Multimedia (TMM), 2025

RefHCM is a unified framework that integrates a wide range of human-centric referring tasks into a sequence-to-sequence paradigm using a plain encoder-decoder transformer.

Recommended citation: Jie Huang, Ruibing Hou, Jiahe Zhao, Hong Chang, Shiguang Shan. (2025). "RefHCM: A Unified Model for Referring Perceptions in Human-Centric Scenarios." IEEE Transactions on Multimedia.

Revisiting Multimodal Positional Encoding in Vision-Language Models

Published as an arXiv preprint, 2025

We propose MHRoPE and MRoPE-I, two simple, plug-and-play positional encoding variants that consistently outperform existing approaches in vision-language models.

Recommended citation: Jie Huang, Xuejing Liu, Shijie Song, Ruibing Hou, Hong Chang, Jinlin Lin, Shuai Bai. (2025). "Revisiting Multimodal Positional Encoding in Vision-Language Models." arXiv preprint arXiv:2510.23095.

Qwen3-VL Technical Report

Published as an arXiv preprint, 2025

Qwen3-VL is the most capable vision-language model in the Qwen series to date, supporting interleaved contexts of up to 256K tokens that mix text, images, and video.

Recommended citation: Shuai Bai, ..., Jie Huang, ..., et al. (2025). "Qwen3-VL Technical Report." arXiv preprint arXiv:2511.21631.

JJJYmmm (Jie Huang)

Pages

Page Not Found

About

Archive Layout with Content

Posts by Category

Posts by Collection

CV

Markdown

Page not in menu

Page Archive

Portfolio

Publications

Sitemap

Posts by Tags

Talk map

Talks and presentations

Teaching

Terms and Privacy Policy

Blog posts

Markdown Generator

talks

teaching