MMMT-IF: A Challenging Multimodal Multi-Turn Instruction Following Benchmark
Evaluating instruction following capabilities for multimodal, multi-turn dialogue is challenging. With potentially multiple instructions in the input model context, the task is time-consuming for human raters and we show LLM based judges are biased towards answers from the same model. We propose MMM...
Saved in:
Main Authors | , , , , |
---|---|
Format | Journal Article |
Language | English |
Published |
26.09.2024
|
Subjects | |
Online Access | Get full text |
Cover
Loading…
Be the first to leave a comment!