
Jiaming Han

be simple, be happy

PhD Student · MMLab, The Chinese University of Hong Kong

Biography

I am a third-year PhD student at MMLab@CUHK, advised by Prof. Xiangyu Yue. Before that, I received my Master's and Bachelor's degrees from Wuhan University and Central South University, respectively. I have also interned at Bytedance Seed, Shanghai AI Lab and Tencent YouTu Lab.

I have broad interests in computer vision, natural language processing and deep learning. Recently, I have focused on unifying vision and language in a single autoregressive model, through work on multimodal LLMs, unified multimodal models and autoregressive generative models.

I expect to graduate in Summer 2027 and am actively seeking full-time job opportunities. Feel free to contact me via email or WeChat.

Selected Publications

BitDance: Scaling Autoregressive Generative Models with Binary Tokens
Yuang Ai*, Jiaming Han*, Shaobin Zhuang*, Weijia Mao, Xuefeng Hu, Ziyan Yang, Zhenheng Yang, Huaibo Huang, Xiangyu Yue, Hao Chen
UniWeTok: An Unified Binary Tokenizer with Codebook Size 2^128 for Unified Multimodal Large Language Model
Shaobin Zhuang*, Yuang Ai*, Jiaming Han*, Weijia Mao, Xiaohui Li, Fangyikang Wang, Xiao Wang, Yan Li, Shanchuan Lin, Kun Xu, Zhenheng Yang, Huaibo Huang, Xiangyu Yue, Hao Chen, Yali Wang
Bridge: Growing Visual Generative Capacity for Pre-Trained MLLMs
Hanyu Wang*, Jiaming Han*, Ziyan Yang, Qi Zhao, Shanchuan Lin, Xiangyu Yue, Abhinav Shrivastava, Zhenheng Yang, Hao Chen
Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
Jiaming Han, Hao Chen, Yang Zhao, Hanyu Wang, Qi Zhao, Ziyan Yang, Hao He, Xiangyu Yue, Lu Jiang
CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms
Shilin Yan, Jiaming Han, Joey Tsai, Hongwei Xue, Rongyao Fang, Lingyi Hong, Ziyu Guo, Ray Zhang
Multimodal Long Video Modeling Based on Temporal Dynamic Context
Haoran Hao*, Jiaming Han*, Yiyuan Zhang, Xiangyu Yue
Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation
Yunhai Feng, Jiaming Han, Zhuoran Yang, Xiangyu Yue, Sergey Levine, Jianlan Luo
Retrieval-Augmented Personalization for Multimodal Large Language Models
Haoran Hao*, Jiaming Han*, Changsheng Li, Yu-Feng Li, Xiangyu Yue
OneLLM: One Framework to Align All Modalities with Language
Jiaming Han, Kaixiong Gong, Yiyuan Zhang, Jiaqi Wang, Kaipeng Zhang, Dahua Lin, Yu Qiao, Peng Gao, Xiangyu Yue
ImageBind-LLM: Multi-modality Instruction Tuning
Jiaming Han*, Renrui Zhang*, Wenqi Shao*, Peng Gao*, Peng Xu*, Han Xiao*, Kaipeng Zhang, Chris Liu, Song Wen, Ziyu Guo, Xudong Lu, Shuai Ren, Yafei Wen, Xiaoxin Chen, Xiangyu Yue, Hongsheng Li, Yu Qiao
LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model
Peng Gao*, Jiaming Han*, Renrui Zhang*, Ziyi Lin*, Shijie Geng, Aojun Zhou, Wei Zhang, Pan Lu, Conghui He, Xiangyu Yue, Hongsheng Li, Yu Qiao
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
Renrui Zhang*, Jiaming Han*, Aojun Zhou, Xiangfei Hu, Shilin Yan, Pan Lu, Hongsheng Li, Peng Gao, Yu Qiao
Few-Shot Object Detection via Variational Feature Aggregation
Jiaming Han, Yuqiang Ren, Jian Ding, Ke Yan, Gui-Song Xia
Expanding Low-Density Latent Regions for Open-Set Object Detection
Jiaming Han, Yuqiang Ren, Jian Ding, Xingjia Pan, Ke Yan, Gui-Song Xia
ReDet: A Rotation-Equivariant Detector for Aerial Object Detection
Jiaming Han, Jian Ding, Nan Xue, Gui-Song Xia
Align Deep Features for Oriented Object Detection
Jiaming Han, Jian Ding, Jie Li, Gui-Song Xia

Experience

06/2024 - Present
Bytedance Seed

Unified MLLM with Dr. Lu Jiang

10/2022 - 02/2024
Shanghai AI Lab

Multimodal LLM with Dr. Peng Gao

05/2021 - 05/2022
Tencent YouTu Lab

Object Detection with Dr. Yuqiang Ren

Education

09/2023 - Present
PhD. The Chinese University of Hong Kong
09/2019 - 06/2022
M.E. Wuhan University
09/2015 - 06/2019
B.E. Central South University

Activities

  • Reviewer for CVPR’21-24, ICCV’21-23, NeurIPS’23, ICLR’24, WACV’22, ECCV’22-24, AAAI’23-24
  • Reviewer for TPAMI, IJCV, TIP, ISPRS, TGRS, TNNLS