CV

Jie Huang

1650675829@qq.com
Beijing, CN

Summary

Master's student at the Institute of Computing Technology, Chinese Academy of Sciences. Research focuses on computer vision and vision-language models.

Education

  • Computer Science
    Present
    Institute of Computing Technology, Chinese Academy of Sciences
  • Cyber Science and Engineering
    2024.6
    Huazhong University of Science and Technology

Work Experience

  • Research Intern
    2025.4 - 2025.9
    Qwen Team, Alibaba Cloud
    Core contributor to Qwen3-VL. Worked on multimodal positional encoding research, inference infrastructure, and model release.

Publications

  • Revisiting Multimodal Positional Encoding in Vision-Language Models
    2026
    Comprehensive analysis of multimodal RoPE in VLMs. Accepted by ICLR 2026.
  • Qwen3-VL Technical Report
    2025
    Technical report for Qwen3-VL, the most capable vision-language model in the Qwen series to date.
  • RefHCM: A Unified Model for Referring Perceptions in Human-Centric Scenarios
    2025
    A unified framework for human-centric referring perception tasks. Accepted by TMM 2025.
  • Stealthy and Effective Physical Adversarial Attacks in Autonomous Driving
    2024
    Physical adversarial attack methods targeting perception systems in autonomous driving. Accepted by TIFS 2024.