An fMRI visual neural encoding method with multimodal large language model
Ma, Shuxiao, Wang, Linyuan, Hou, Libin, Hou, Senbao, Yan, Bin
Published in Knowledge-based systems (27.09.2025)
Journal Article
GSVA: Generalized Segmentation via Multimodal Large Language Models
Xia, Zhuofan, Han, Dongchen, Han, Yizeng, Pan, Xuran, Song, Shiji, Huang, Gao
Published in Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) (16.06.2024)
Conference Proceeding
Analyzing the performance of multimodal large language models on visually-based questions in the Japanese National Examination for Dental Technicians
Mine, Yuichi, Taji, Tsuyoshi, Okazaki, Shota, Takeda, Saori, Peng, Tzu-Yu, Shimoe, Saiji, Kaku, Masato, Nikawa, Hiroki, Kakimoto, Naoya, Murayama, Takeshi
Published in Journal of dental sciences (2025)
Journal Article
MMMU: A Massive Multi-Discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
Yue, Xiang, Ni, Yuansheng, Zheng, Tianyu, Zhang, Kai, Liu, Ruoqi, Zhang, Ge, Stevens, Samuel, Jiang, Dongfu, Ren, Weiming, Sun, Yuxuan, Wei, Cong, Yu, Botao, Yuan, Ruibin, Sun, Renliang, Yin, Ming, Zheng, Boyuan, Yang, Zhenzhu, Liu, Yibo, Huang, Wenhao, Sun, Huan, Su, Yu, Chen, Wenhu
Published in Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) (16.06.2024)
Conference Proceeding
MM-R1: Unleashing the Power of Unified Multimodal Large Language Models for Personalized Image Generation
Liang, Qian, Wu, Yujia, Li, Kuncheng, Wei, Jiwei, He, Shiyuan, Guo, Jinyu, Xie, Ning
Year of Publication 26.08.2025
Journal Article
MME-SCI: A Comprehensive and Challenging Science Benchmark for Multimodal Large Language Models
Ruan, Jiacheng, Jiang, Dan, Gao, Xian, Liu, Ting, Fu, Yuzhuo, Kang, Yangyang
Year of Publication 19.08.2025
Journal Article
GSVA: Generalized Segmentation via Multimodal Large Language Models
Xia, Zhuofan, Han, Dongchen, Han, Yizeng, Pan, Xuran, Song, Shiji, Huang, Gao
Year of Publication 14.12.2023
Journal Article
Towards Zero-Shot Differential Morphing Attack Detection with Multimodal Large Language Models
Shekhawat, Ria, Li, Hailin, Ramachandra, Raghavendra, Venkatesh, Sushma
Published in IEEE International Conference and Workshops on Automatic Face and Gesture Recognition : FG (26.05.2025)
Conference Proceeding
Towards Zero-Shot Differential Morphing Attack Detection with Multimodal Large Language Models
Shekhawat, Ria, Li, Hailin, Ramachandra, Raghavendra, Venkatesh, Sushma
Year of Publication 21.05.2025
Journal Article
Kosmos-G: Generating Images in Context with Multimodal Large Language Models
Pan, Xichen, Dong, Li, Huang, Shaohan, Peng, Zhiliang, Chen, Wenhu, Wei, Furu
Year of Publication 04.10.2023
Journal Article
GSVA: Generalized Segmentation via Multimodal Large Language Models
Xia, Zhuofan, Han, Dongchen, Han, Yizeng, Pan, Xuran, Song, Shiji, Huang, Gao
Published in arXiv.org (21.03.2024)
Paper
Kosmos-G: Generating Images in Context with Multimodal Large Language Models
Pan, Xichen, Dong, Li, Huang, Shaohan, Peng, Zhiliang, Chen, Wenhu, Wei, Furu
Published in arXiv.org (15.03.2024)
Paper
AV-FOS: A Transformer-Based Audio-Visual Multi-modal Interaction Style Recognition for Children with Autism Based on the Family Observation Schedule (FOS-II)
Zhao, Zhenhao, Chung, Eunsun, Chung, Kyong-Mee, Park, Chung Hyuk
Published in IEEE journal of biomedical and health informatics (13.02.2025)
Journal Article