TVR-Ranking: A Dataset for Ranked Video Moment Retrieval with Imprecise Queries

Bibliographic Details
Published in	arXiv.org
Main Authors	Liang, Renjie; Li, Li; Zhang, Chongzhi; Wang, Jing; Zhu, Xizhou; Sun, Aixin
Format	Paper
Language	English
Published	Ithaca: Cornell University Library, arXiv.org, 24.07.2024
Subjects	Annotations; Datasets; Queries; Ranking; Retrieval; Video

Summary:	In this paper, we propose the task of Ranked Video Moment Retrieval (RVMR): given a natural-language query, locate a ranked list of matching moments from a collection of videos. Although a few related tasks have been proposed and studied by the CV, NLP, and IR communities, RVMR is the task that best reflects the practical setting of moment search. To facilitate research in RVMR, we develop the TVR-Ranking dataset, based on the raw videos and existing moment annotations provided in the TVR dataset. Our key contribution is the manual annotation of relevance levels for 94,442 query-moment pairs. We then develop the \(NDCG@K, IoU \geq \mu\) evaluation metric for this new task and conduct experiments to evaluate three baseline models. Our experiments show that the new RVMR task brings new challenges to existing models, and we believe this new dataset contributes to research on multi-modal search. The dataset is available at https://github.com/Ranking-VMR/TVR-Ranking.
ISSN:	2331-8422
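
The summary names an \(NDCG@K, IoU \geq \mu\) metric but does not define it in this record. The following is a minimal Python sketch of one plausible reading, not the authors' implementation: a predicted moment earns the relevance level of an annotated moment only when the two overlap with temporal IoU of at least mu, and the resulting gains are scored with standard NDCG over the top K predictions. The function names, the tuple layouts, the linear gain, the one-to-one matching rule, and the relevance values in the usage example are all assumptions made for illustration.

```python
import math


def temporal_iou(a, b):
    """Intersection over union of two (start, end) intervals in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def dcg(gains):
    """Discounted cumulative gain with a log2 position discount."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg_at_k_iou(predictions, ground_truth, k, mu):
    """Hypothetical NDCG@K with an IoU >= mu matching rule.

    predictions:  ranked list of (video_id, start, end)
    ground_truth: list of (video_id, start, end, relevance) for one query
    """
    used = set()  # assumption: each annotated moment can be matched once
    gains = []
    for vid, start, end in predictions[:k]:
        best_idx, best_rel = None, 0
        for idx, (g_vid, g_start, g_end, rel) in enumerate(ground_truth):
            if idx in used or g_vid != vid:
                continue
            overlap = temporal_iou((start, end), (g_start, g_end))
            if overlap >= mu and rel > best_rel:
                best_idx, best_rel = idx, rel
        if best_idx is not None:
            used.add(best_idx)
        gains.append(best_rel)  # 0 when no annotated moment is matched

    # Ideal ranking: annotated relevance levels sorted from high to low.
    ideal = sorted((g[3] for g in ground_truth), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Usage with made-up annotations (relevance values are illustrative only).
truth = [("v1", 0.0, 10.0, 3), ("v2", 5.0, 15.0, 1)]
ranked = [("v2", 5.0, 15.0), ("v1", 0.0, 10.0)]
score = ndcg_at_k_iou(ranked, truth, k=2, mu=0.5)
```

In this sketch, raising mu makes the localization requirement stricter, while K controls how deep into the ranked list the score looks, so the two parameters separate ranking quality from boundary accuracy.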