RefHCM: A Unified Model for Referring Perceptions in Human-Centric Scenarios

Published in IEEE Transactions on Multimedia (TMM), 2025

We introduce Referring Human Perceptions and propose RefHCM, a unified framework that uses sequence mergers to convert raw multimodal data into semantic tokens, so that diverse human-centric referring tasks can be reformulated in a single sequence-to-sequence paradigm. RefHCM facilitates knowledge transfer across tasks and can handle complex reasoning.

Paper | Code

Recommended citation: Jie Huang, Ruibing Hou, Jiahe Zhao, Hong Chang, Shiguang Shan. (2025). "RefHCM: A Unified Model for Referring Perceptions in Human-Centric Scenarios." IEEE Transactions on Multimedia.