Sparse-Tuning: Adapting Vision Transformers with Efficient Fine-tuning and Inference
Liu, Ting, Liu, Xuyang, Huang, Siteng, Shi, Liangtao, Xu, Zunnan, Xin, Yi, Yin, Quanjun, Liu, Xiaohong
Year of Publication 23.05.2024
Journal Article
DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding
Liu, Ting, Liu, Xuyang, Huang, Siteng, Chen, Honggang, Yin, Quanjun, Qin, Long, Wang, Donglin, Hu, Yue
Year of Publication 09.05.2024
Journal Article
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference
Zhao, Han, Zhang, Min, Zhao, Wei, Ding, Pengxiang, Huang, Siteng, Wang, Donglin
Year of Publication 21.03.2024
Journal Article
QUAR-VLA: Vision-Language-Action Model for Quadruped Robots
Ding, Pengxiang, Zhao, Han, Song, Wenxuan, Zhang, Wenjie, Zhang, Min, Huang, Siteng, Yang, Ningxi, Wang, Donglin
Year of Publication 22.12.2023
Journal Article
Prompt-based Distribution Alignment for Unsupervised Domain Adaptation
Bai, Shuanghao, Zhang, Min, Zhou, Wanqi, Huang, Siteng, Luan, Zhirong, Wang, Donglin, Chen, Badong
Year of Publication 15.12.2023
Journal Article
Check, Locate, Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation
Gong, Biao, Huang, Siteng, Feng, Yutong, Zhang, Shiwei, Li, Yuyuan, Liu, Yu
Year of Publication 27.11.2023
Journal Article
Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation
Huang, Siteng, Gong, Biao, Feng, Yutong, Chen, Xi, Fu, Yuqian, Liu, Yu, Wang, Donglin
Year of Publication 27.11.2023
Journal Article
Troika: Multi-Path Cross-Modal Traction for Compositional Zero-Shot Learning
Huang, Siteng, Gong, Biao, Feng, Yutong, Zhang, Min, Lv, Yiliang, Wang, Donglin
Year of Publication 27.03.2023
Journal Article
VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval
Huang, Siteng, Gong, Biao, Pan, Yulin, Jiang, Jianwen, Lv, Yiliang, Li, Yuyuan, Wang, Donglin
Year of Publication 23.11.2022
Journal Article
Accelerating Diffusion Transformers with Token-wise Feature Caching
Zou, Chang, Liu, Xuyang, Liu, Ting, Huang, Siteng, Zhang, Linfeng
Published in arXiv.org (14.10.2024)
Paper
Focus-Consistent Multi-Level Aggregation for Compositional Zero-Shot Learning
Dai, Fengyuan, Huang, Siteng, Zhang, Min, Gong, Biao, Wang, Donglin
Published in arXiv.org (30.08.2024)
Paper
VGDiffZero: Text-to-image Diffusion Models Can Be Zero-shot Visual Grounders
Liu, Xuyang, Huang, Siteng, Kang, Yachen, Chen, Honggang, Wang, Donglin
Published in arXiv.org (23.01.2024)
Paper
ProFD: Prompt-Guided Feature Disentangling for Occluded Person Re-Identification
Cui, Can, Huang, Siteng, Song, Wenxuan, Ding, Pengxiang, Zhang, Min, Wang, Donglin
Published in arXiv.org (30.09.2024)
Paper
PiTe: Pixel-Temporal Alignment for Large Video-Language Model
Liu, Yang, Ding, Pengxiang, Huang, Siteng, Zhang, Min, Zhao, Han, Wang, Donglin
Published in arXiv.org (11.09.2024)
Paper