Biography
I am a PhD student at MMLab@CUHK, advised by Prof. Xiangyu Yue. My recent work focuses on efficient and unified multimodal LLMs, such as LLaMA-Adapter, OneLLM and Tar. I received my Master's and Bachelor's degrees from Wuhan University and Central South University, respectively. I have interned at Bytedance Seed, Tencent AI Lab, Shanghai AI Lab and Tencent YouTu Lab.
More about me: Email | Google Scholar | GitHub | Curriculum Vitae
News
- 08/2025: Reflective Planning is accepted by CoRL 2025.
- 02/2025: RAP is accepted by CVPR 2025.
- 02/2024: OneLLM is accepted by CVPR 2024.
- 01/2024: LLaMA-Adapter is accepted by ICLR 2024!
- 12/2023: We release OneLLM, which aligns eight modalities to language using a unified framework.
- 09/2023: ImageBind-LLM is released on arXiv.
- 05/2023: We release ImageBind-LLM: an LLM that connects Image, Video, Audio, Point Cloud and more! Check our demo.
- 04/2023: We release LLaMA-Adapter V2, a multi-modal instruction model. Check our demo at OpenGVLab.
- 03/2023: We release the paper and code of LLaMA-Adapter.
- 11/2022: One paper on Few-Shot Object Detection is accepted by AAAI 2023.
- 03/2022: We release the paper and code of OpenDet.
- 03/2022: One paper on Open-Set Object Detection is accepted by CVPR 2022.
- 02/2022: Our works S2A-Net and ReDet are included in OpenMMLab’s mmrotate.
- 08/2021: Third-party implementations of S2A-Net are available in Jittor and PaddlePaddle.
- 03/2021: We release the paper and code of ReDet.
- 02/2021: One paper is accepted by CVPR 2021.
Publications
- Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
Jiaming Han, Hao Chen, Yang Zhao, Hanyu Wang, Qi Zhao, Ziyan Yang, Hao He, Xiangyu Yue, Lu Jiang
Project Page | Paper | Code | Models | Demo
- CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms
Shilin Yan, Jiaming Han, Joey Tsai, Hongwei Xue, Rongyao Fang, Lingyi Hong, Ziyu Guo, Ray Zhang
Paper | Code
- Multimodal Long Video Modeling Based on Temporal Dynamic Context
Haoran Hao*, Jiaming Han*, Yiyuan Zhang, Xiangyu Yue
Project Page | Paper | Code
- Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation
Yunhai Feng, Jiaming Han, Zhuoran Yang, Xiangyu Yue, Sergey Levine, Jianlan Luo
CoRL 2025 | Project Page | Paper | Code
- Retrieval-Augmented Personalization for Multimodal Large Language Models
Haoran Hao*, Jiaming Han*, Changsheng Li, Yu-Feng Li, Xiangyu Yue
CVPR 2025 | Project Page | Paper | Code
- OneLLM: One Framework to Align All Modalities with Language
Jiaming Han, Kaixiong Gong, Yiyuan Zhang, Jiaqi Wang, Kaipeng Zhang, Dahua Lin, Yu Qiao, Peng Gao, Xiangyu Yue
CVPR 2024 | Project Page | Paper | Code | Demo
- ImageBind-LLM: Multi-modality Instruction Tuning
Jiaming Han*, Renrui Zhang*, Wenqi Shao*, Peng Gao*, Peng Xu*, Han Xiao*, Kaipeng Zhang, Chris Liu, Song Wen, Ziyu Guo, Xudong Lu, Shuai Ren, Yafei Wen, Xiaoxin Chen, Xiangyu Yue, Hongsheng Li, Yu Qiao
Paper | Code | Demo
- LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model
Peng Gao*, Jiaming Han*, Renrui Zhang*, Ziyi Lin*, Shijie Geng, Aojun Zhou, Wei Zhang, Pan Lu, Conghui He, Xiangyu Yue, Hongsheng Li, Yu Qiao
Paper | Code | Demo
- LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
Renrui Zhang*, Jiaming Han*, Aojun Zhou, Xiangfei Hu, Shilin Yan, Pan Lu, Hongsheng Li, Peng Gao, Yu Qiao
ICLR 2024 | Paper | Code | Demo
- Few-Shot Object Detection via Variational Feature Aggregation
Jiaming Han, Yuqiang Ren, Jian Ding, Ke Yan, Gui-Song Xia
AAAI 2023 | Paper | Code
- Expanding Low-Density Latent Regions for Open-Set Object Detection
Jiaming Han, Yuqiang Ren, Jian Ding, Xingjia Pan, Ke Yan, Gui-Song Xia
CVPR 2022 | Paper | Code
- ReDet: A Rotation-Equivariant Detector for Aerial Object Detection
Jiaming Han, Jian Ding, Nan Xue, Gui-Song Xia
CVPR 2021 | Project | Paper | Code | Poster | Slide
- Align Deep Features for Oriented Object Detection
Jiaming Han, Jian Ding, Jie Li, Gui-Song Xia
TGRS 2021 | Paper | Code
Experience
- Bytedance Seed 06/2024 - Present
Unified MLLM with Dr. Lu Jiang
- Shanghai AI Lab 10/2022 - 02/2024
Multimodal LLM with Dr. Peng Gao
- Tencent YouTu Lab 05/2021 - 05/2022
Object Detection with Dr. Yuqiang Ren
Activities
- Reviewer for CVPR’21-24, ICCV’21-23, NeurIPS’23, ICLR’24, WACV’22, ECCV’22-24, AAAI’23-24
- Reviewer for TPAMI, IJCV, TIP, ISPRS, TGRS, TNNLS
Education
- Ph.D. The Chinese University of Hong Kong. 09/2023 - Present
- M.E. Wuhan University. 09/2019 - 06/2022
- B.E. Central South University. 09/2015 - 06/2019