Relevance- and interface-driven clustering for visual information retrieval

Bibliographic Details
Published in Information systems (Oxford) Vol. 94; p. 101592
Main Authors Bouadjenek, Mohamed Reda; Sanner, Scott; Du, Yihao
Format Journal Article
Language English
Published Oxford Elsevier Ltd 01.12.2020
Elsevier Science Ltd
Summary: Search results of spatio-temporal data are often displayed on a map, but when the number of matching search results is large, it can be time-consuming to individually examine all results, even when using methods such as filtered search to narrow the content focus. This suggests the need to aggregate results via a clustering method. However, standard unsupervised clustering algorithms like K-means (i) ignore relevance scores that can help with the extraction of highly relevant clusters, and (ii) do not necessarily optimize search results for purposes of visual presentation. In this article, we address both deficiencies by framing the clustering problem for search-driven user interfaces in a novel optimization framework that (i) aims to maximize the relevance of aggregated content according to cluster-based extensions of standard information retrieval metrics and (ii) defines clusters via constraints that naturally reflect interface-driven desiderata of spatial, temporal, and keyword coherence and that do not require complex ad-hoc distance metric specifications as in K-means. After comparatively benchmarking algorithmic variants of our proposed approach, RadiCAL, in offline experiments, we undertake a user study with 24 subjects to evaluate whether RadiCAL improves human performance on visual search tasks in comparison to K-means clustering and a filtered search baseline. Our results show that (a) our binary partitioning search (BPS) variant of RadiCAL is fast, near-optimal, and extracts higher-relevance clusters than K-means, and (b) clusters optimized via RadiCAL result in faster search task completion with higher accuracy and minimal workload, yielding high effectiveness, efficiency, and user satisfaction relative to the alternatives.
• We present a novel relevance-driven clustering algorithm for Visual IR.
• We present expected F1-score (EF1) as a new objective for clustering in IR.
• We demonstrate that the optimal solution to EF1 maximization can be cast as a MILP.
• We present two efficient greedy algorithms for optimizing EF1.
• Experiments show that relevance-driven clustering improves user performance.
• We provide the source code as an open-source library.
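To give a rough sense of what a relevance-driven objective such as EF1 might look like, the sketch below scores one candidate cluster with a plug-in approximation of expected F1. This is an illustrative assumption, not the definition used in the article: it treats each relevance score as a probability of relevance and substitutes expectations into F1 = 2·TP / (|selected| + |relevant|). The function name `expected_f1` and its parameters are hypothetical.

```python
from typing import Sequence


def expected_f1(relevance: Sequence[float], in_cluster: Sequence[bool]) -> float:
    """Plug-in approximation of expected F1 for one candidate cluster.

    Assumption (not taken from the article): relevance[i] is the probability
    that item i is relevant, and in_cluster[i] says whether item i is selected.
    """
    expected_relevant = sum(relevance)              # E[number of relevant items]
    selected = sum(1 for s in in_cluster if s)      # number of selected items
    if selected == 0 or expected_relevant == 0:
        return 0.0
    # E[true positives]: relevance mass that falls inside the cluster
    expected_tp = sum(p for p, s in zip(relevance, in_cluster) if s)
    # F1 = 2*TP / (|selected| + |relevant|), with expectations plugged in
    return 2.0 * expected_tp / (selected + expected_relevant)


scores = [0.9, 0.8, 0.1, 0.2]
tight = expected_f1(scores, [True, True, False, False])   # only the high-scoring items
everything = expected_f1(scores, [True, True, True, True])  # all items selected
```

Under these assumptions, the cluster containing only the two high-scoring items scores higher than selecting everything, which is the kind of preference a relevance-agnostic method such as K-means does not express.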
ISSN:0306-4379
1873-6076
DOI:10.1016/j.is.2020.101592