Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Xiao, Bin, Wu, Haiping, Xu, Weijian, Dai, Xiyang, Hu, Houdong, Lu, Yumao, Zeng, Michael, Liu, Ce, Yuan, Lu
Published in arXiv.org (10.11.2023)
Paper
Providing recommended contents
Lu, Yumao, Deng, Yongjian, Shou, Linjun, Zhou, Jie, Fan, Baoquan, Pan, Jun, Cai, Wenbin
Year of Publication 08.11.2022
Patent
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA
Yang, Zhengyuan, Gan, Zhe, Wang, Jianfeng, Hu, Xiaowei, Lu, Yumao, Liu, Zicheng, Wang, Lijuan
Published in arXiv.org (14.09.2022)
Paper
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling
Yang, Zhengyuan, Gan, Zhe, Wang, Jianfeng, Hu, Xiaowei, Ahmed, Faisal, Liu, Zicheng, Lu, Yumao, Wang, Lijuan
Published in arXiv.org (27.07.2022)
Paper
Search system that provides personalized results
Lu, Yumao, Krishnamurthi, Priyanka, Deng, Yongjian, Sureshchandra, Bhimani Kalpesh, Parthasarathy, Ashwin Mallur
Year of Publication 24.05.2022
Patent
Florence: A New Foundation Model for Computer Vision
Yuan, Lu, Chen, Dongdong, Chen, Yi-Ling, Codella, Noel, Dai, Xiyang, Gao, Jianfeng, Hu, Houdong, Huang, Xuedong, Li, Boxin, Li, Chunyuan, Liu, Ce, Liu, Mengchen, Liu, Zicheng, Lu, Yumao, Shi, Yu, Wang, Lijuan, Wang, Jianfeng, Xiao, Bin, Xiao, Zhen, Yang, Jianwei, Zeng, Michael, Zhou, Luowei, Zhang, Pengchuan
Year of Publication 22.11.2021
Journal Article
Scaling Up Vision-Language Pre-training for Image Captioning
Hu, Xiaowei, Gan, Zhe, Wang, Jianfeng, Yang, Zhengyuan, Liu, Zicheng, Lu, Yumao, Wang, Lijuan
Published in arXiv.org (26.03.2022)
Paper
MM-VID: Advancing Video Understanding with GPT-4V(ision)
Lin, Kevin, Ahmed, Faisal, Li, Linjie, Lin, Chung-Ching, Azarnasab, Ehsan, Yang, Zhengyuan, Wang, Jianfeng, Lin, Liang, Liu, Zicheng, Lu, Yumao, Liu, Ce, Wang, Lijuan
Published in arXiv.org (30.10.2023)
Paper
UFO: A UniFied TransfOrmer for Vision-Language Representation Learning
Wang, Jianfeng, Hu, Xiaowei, Gan, Zhe, Yang, Zhengyuan, Dai, Xiyang, Liu, Zicheng, Lu, Yumao, Wang, Lijuan
Published in arXiv.org (19.11.2021)
Paper
SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning
Lin, Kevin, Li, Linjie, Lin, Chung-Ching, Ahmed, Faisal, Gan, Zhe, Liu, Zicheng, Lu, Yumao, Wang, Lijuan
Published in arXiv.org (18.06.2022)
Paper
Providing recommended contents
Lu, Yumao, Deng, Yongjian, Shou, Linjun, Zhou, Jie, Fan, Baoquan, Pan, Jun, Cai, Wenbin
Year of Publication 20.02.2020
Patent
Florence: A New Foundation Model for Computer Vision
Yuan, Lu, Chen, Dongdong, Chen, Yi-Ling, Codella, Noel, Dai, Xiyang, Gao, Jianfeng, Hu, Houdong, Huang, Xuedong, Li, Boxin, Li, Chunyuan, Liu, Ce, Liu, Mengchen, Liu, Zicheng, Lu, Yumao, Shi, Yu, Wang, Lijuan, Wang, Jianfeng, Xiao, Bin, Xiao, Zhen, Yang, Jianwei, Zeng, Michael, Zhou, Luowei, Zhang, Pengchuan
Published in arXiv.org (22.11.2021)
Paper