Multimodal AI model for zero-shot vehicle brand identification
Published in | Multimedia tools and applications Vol. 84; no. 27; pp. 33125 - 33144 |
---|---|
Main Author | |
Format | Journal Article |
Language | English |
Published | New York, Springer US, 01.08.2025, Springer Nature B.V. |
Subjects | |
Summary: | Identifying vehicle brands is a crucial aspect of advancing media technology in intelligent transportation systems, yet it remains challenging due to the wide variety of car models and the complexities inherent in real-world traffic conditions. This study investigates the potential of OpenAI’s GPT-4v, an advanced multimodal language model, in automating the recognition of vehicle makes using the CompCars dataset. Notably, GPT-4v exhibits impressive zero-shot recognition capabilities, identifying both the number of vehicles and their makes without the need for finetuning or additional training. However, the model’s accuracy declines when processing images with multiple vehicles. A more focused analysis on single-vehicle instances highlights consistent difficulties in identifying car makes from China, with a significant number of predictions categorized as UNKNOWN. Additionally, GPT-4v frequently misidentifies Chinese-made vehicles as originating from other countries. These findings suggest that additional training or finetuning may be necessary to enhance GPT-4v’s performance in recognizing Chinese car makes. This research represents the first exploration of GPT-4v for vision-based zero-shot vehicle brand identification, offering valuable insights into its capabilities and limitations and setting the stage for future advancements in automated vehicle recognition technology. |
---|---|
ISSN: | 1573-7721 1380-7501 |
DOI: | 10.1007/s11042-024-20559-3 |