Modality attention fusion model with hybrid multi-head self-attention for video understanding
Zhuang, Xuqiang, Liu, Fang’ai, Hou, Jian, Hao, Jianhua, Cai, Xiaohong
Published in PloS one (06.10.2022)
Journal Article
Topology Dictionary for 3D Video Understanding
Tung, T., Matsuyama, T.
Published in IEEE transactions on pattern analysis and machine intelligence (01.08.2012)
Journal Article
Towards Multi-Sweep Ultrasound Video Understanding: Application in Detection of Breech Position Using Statistical Priors
Gleed, A. D., Mishra, D., Chandramohan, V., Fu, Z., Self, A., Bhatnagar, S., Papageorghiou, A. T., Noble, J. A.
Published in Proceedings (International Symposium on Biomedical Imaging) (18.04.2023)
Conference Proceeding
Instrument-Tissue-Guided Surgical Action Triplet Detection via Textual-Temporal Trail Exploration
Pei, Jialun, Zhang, Jiaan, Qin, Guanyi, Wang, Kai, Jin, Yueming, Heng, Pheng-Ann
Published in IEEE transactions on medical imaging (18.07.2025)
Journal Article
Multi-modal temporal action segmentation for manufacturing scenarios
Romeo, Laura, Marani, Roberto, Perri, Anna Gina, Gall, Juergen
Published in Engineering applications of artificial intelligence (15.05.2025)
Journal Article
AV-FOS: A Transformer-Based Audio-Visual Multi-modal Interaction Style Recognition for Children with Autism Based on the Family Observation Schedule (FOS-II)
Zhao, Zhenhao, Chung, Eunsun, Chung, Kyong-Mee, Park, Chung Hyuk
Published in IEEE journal of biomedical and health informatics (13.02.2025)
Journal Article
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
Zhao, Yilun, Zhang, Haowei, Xie, Lujing, Hu, Tongyan, Gan, Guo, Long, Yitao, Hu, Zhiyuan, Chen, Weiyuan, Li, Chuhan, Xu, Zhijian, Wang, Chengye, Shangguan, Ziyao, Liang, Zhenwen, Liu, Yixin, Zhao, Chen, Cohan, Arman
Published in Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) (10.06.2025)
Conference Proceeding
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
Zhao, Yilun, Xie, Lujing, Zhang, Haowei, Gan, Guo, Long, Yitao, Hu, Zhiyuan, Hu, Tongyan, Chen, Weiyuan, Li, Chuhan, Song, Junyang, Xu, Zhijian, Wang, Chengye, Pan, Weifeng, Shangguan, Ziyao, Tang, Xiangru, Liang, Zhenwen, Liu, Yixin, Zhao, Chen, Cohan, Arman
Year of Publication 21.01.2025
Journal Article
Counting Bites and Recognizing Consumed Food from Videos for Passive Dietary Monitoring
Qiu, Jianing, Lo, Frank P.-W., Jiang, Shuo, Tsai, Ya-Yen, Sun, Yingnan, Lo, Benny
Published in IEEE journal of biomedical and health informatics (01.05.2021)
Journal Article
Video Visual Relation Detection with Contextual Knowledge Embedding
Cao, Qianwen, Huang, Heyan
Published in IEEE transactions on knowledge and data engineering (01.12.2023)
Journal Article
Knowledge representation and learning of operator clinical workflow from full-length routine fetal ultrasound scan videos
Sharma, Harshita, Drukker, Lior, Chatelain, Pierre, Droste, Richard, Papageorghiou, Aris T., Noble, J. Alison
Published in Medical image analysis (01.04.2021)
Journal Article
VideoMind: An Omni-Modal Video Dataset with Intent Grounding for Deep-Cognitive Video Understanding
Yang, Baoyao, Li, Wanyun, Chen, Dixin, Chen, Junxiang, Yao, Wenbin, Lin, Haifeng
Year of Publication 24.07.2025
Journal Article
3DMesh-GAR: 3D Human Body Mesh-Based Method for Group Activity Recognition
Saqlain, Muhammad, Kim, Donguk, Cha, Junuk, Lee, Changhwa, Lee, Seongyeong, Baek, Seungryul
Published in Sensors (Basel, Switzerland) (14.02.2022)
Journal Article
SFF-DA: Spatiotemporal Feature Fusion for Nonintrusively Detecting Anxiety
Mo, Haimiao, Li, Yuchen, Han, Peng, Liao, Xiao, Zhang, Wei, Ding, Shuai
Published in IEEE transactions on instrumentation and measurement (01.01.2024)
Journal Article