From Pixels to Tokens: Revisiting Object Hallucinations in Large Vision-Language Models
Hallucinations in large vision-language models (LVLMs), i.e., generating objects that are not present in the visual input, are a significant challenge that impairs their reliability. Recent studies often attribute hallucinations to a lack of understanding of visual input, yet ignore a more fundam...
Format | Journal Article
---|---
Language | English
Published | 09.10.2024
Online Access | Get full text