GenAI Arena: An Open Evaluation Platform for Generative Models
Jiang, Dongfu, Ku, Max, Li, Tianle, Ni, Yuansheng, Sun, Shizhuo, Fan, Rongqi, Chen, Wenhu
Year of Publication 06.06.2024
Journal Article
MANTIS: Interleaved Multi-Image Instruction Tuning
Jiang, Dongfu, He, Xuan, Zeng, Huaye, Wei, Cong, Ku, Max, Liu, Qian, Chen, Wenhu
Year of Publication 02.05.2024
Journal Article
ImagenHub: Standardizing the evaluation of conditional image generation models
Ku, Max, Li, Tianle, Zhang, Kai, Lu, Yujie, Fu, Xingyu, Zhuang, Wenwen, Chen, Wenhu
Year of Publication 02.10.2023
Journal Article
TheoremQA: A Theorem-driven Question Answering dataset
Chen, Wenhu, Yin, Ming, Ku, Max, Lu, Pan, Wan, Yixin, Ma, Xueguang, Xu, Jianyu, Wang, Xinyi, Xia, Tony
Year of Publication 21.05.2023
Journal Article
VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation
He, Xuan, Jiang, Dongfu, Zhang, Ge, Ku, Max, Soni, Achint, Siu, Sherman, Chen, Haonan, Chandra, Abhranil, Jiang, Ziyan, Arulraj, Aaran, Wang, Kai, Do, Quy Duc, Ni, Yuansheng, Lyu, Bohan, Narsupalli, Yaswanth, Fan, Rongqi, Lyu, Zhiheng, Lin, Yuchen, Chen, Wenhu
Year of Publication 21.06.2024
Journal Article
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark
Wang, Yubo, Ma, Xueguang, Zhang, Ge, Ni, Yuansheng, Chandra, Abhranil, Guo, Shiguang, Ren, Weiming, Arulraj, Aaran, He, Xuan, Jiang, Ziyan, Li, Tianle, Ku, Max, Wang, Kai, Zhuang, Alex, Fan, Rongqi, Yue, Xiang, Chen, Wenhu
Year of Publication 03.06.2024
Journal Article
GenAI Arena: An Open Evaluation Platform for Generative Models
Jiang, Dongfu, Ku, Max, Li, Tianle, Ni, Yuansheng, Sun, Shizhuo, Fan, Rongqi, Chen, Wenhu
Published in arXiv.org (06.08.2024)
Paper
ImagenHub: Standardizing the evaluation of conditional image generation models
Ku, Max, Li, Tianle, Zhang, Kai, Lu, Yujie, Fu, Xingyu, Zhuang, Wenwen, Chen, Wenhu
Published in arXiv.org (10.03.2024)
Paper
TheoremQA: A Theorem-driven Question Answering dataset
Chen, Wenhu, Yin, Ming, Ku, Max, Lu, Pan, Wan, Yixin, Ma, Xueguang, Xu, Jianyu, Wang, Xinyi, Xia, Tony
Published in arXiv.org (06.12.2023)
Paper
VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation
He, Xuan, Jiang, Dongfu, Zhang, Ge, Ku, Max, Soni, Achint, Siu, Sherman, Chen, Haonan, Chandra, Abhranil, Jiang, Ziyan, Arulraj, Aaran, Wang, Kai, Do, Quy Duc, Ni, Yuansheng, Lyu, Bohan, Narsupalli, Yaswanth, Fan, Rongqi, Lyu, Zhiheng, Lin, Yuchen, Chen, Wenhu
Published in arXiv.org (24.06.2024)
Paper
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark
Wang, Yubo, Ma, Xueguang, Zhang, Ge, Ni, Yuansheng, Chandra, Abhranil, Guo, Shiguang, Ren, Weiming, Arulraj, Aaran, He, Xuan, Jiang, Ziyan, Li, Tianle, Ku, Max, Wang, Kai, Zhuang, Alex, Fan, Rongqi, Yue, Xiang, Chen, Wenhu
Published in arXiv.org (23.06.2024)
Paper