CV

Jie Huang

1650675829@qq.com

Beijing, CN

Master's student at the Institute of Computing Technology, Chinese Academy of Sciences. Research focuses on computer vision and vision-language models.

Computer Science
Present
Institute of Computing Technology, Chinese Academy of Sciences
Cyber Science and Engineering
2024.6
Huazhong University of Science and Technology

Research Intern
2025.4 - 2025.9
Qwen Team, Alibaba Cloud
Core contributor to Qwen3-VL. Worked on multimodal positional encoding research, inference infrastructure, and model release.

Revisiting Multimodal Positional Encoding in Vision-Language Models
2026
A comprehensive analysis of multimodal RoPE in VLMs. Accepted to ICLR 2026.
Qwen3-VL Technical Report
2025
The most capable vision-language model in the Qwen series to date.
RefHCM: A Unified Model for Referring Perceptions in Human-Centric Scenarios
2025
A unified framework for human-centric referring perception tasks. Accepted by TMM 2025.
Stealthy and Effective Physical Adversarial Attacks in Autonomous Driving
2024
Physical adversarial attack methods targeting perception systems in autonomous driving. Accepted by TIFS 2024.