Relevance- and interface-driven clustering for visual information retrieval
Published in | Information systems (Oxford) Vol. 94; p. 101592 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Oxford, Elsevier Ltd, 01.12.2020, Elsevier Science Ltd |
Subjects | |
Summary: | Search results of spatio-temporal data are often displayed on a map, but when the number of matching results is large, examining each one individually is time-consuming, even when methods such as filtered search narrow the content focus. This suggests the need to aggregate results via a clustering method. However, standard unsupervised clustering algorithms like K-means (i) ignore relevance scores that can help with the extraction of highly relevant clusters, and (ii) do not necessarily optimize search results for purposes of visual presentation. In this article, we address both deficiencies by framing the clustering problem for search-driven user interfaces in a novel optimization framework that (i) aims to maximize the relevance of aggregated content according to cluster-based extensions of standard information retrieval metrics and (ii) defines clusters via constraints that naturally reflect interface-driven desiderata of spatial, temporal, and keyword coherence, without requiring the complex ad-hoc distance metric specifications that K-means needs. After benchmarking algorithmic variants of our proposed approach, RadiCAL, in offline experiments, we undertake a user study with 24 subjects to evaluate whether RadiCAL improves human performance on visual search tasks in comparison to K-means clustering and a filtered search baseline. Our results show that (a) our binary partitioning search (BPS) variant of RadiCAL is fast, near-optimal, and extracts higher-relevance clusters than K-means, and (b) clusters optimized via RadiCAL result in faster search task completion with higher accuracy while requiring minimal workload, leading to high effectiveness, efficiency, and user satisfaction relative to the alternatives.
• We present a novel relevance-driven clustering algorithm for Visual IR.
• We present expected F1-score (EF1) as a new objective for clustering in IR.
• We demonstrate that the optimal solution to EF1 maximization can be cast as a MILP.
• We present two efficient greedy algorithms for optimizing EF1.
• Experiments show that relevance-driven clustering improves user performance.
• We provide the source code as an open-source library. |
---|---|
ISSN: | 0306-4379 1873-6076 |
DOI: | 10.1016/j.is.2020.101592 |
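
The summary describes choosing clusters by a relevance-aware objective rather than by distance alone. The following is a minimal sketch of that general idea, not the article's RadiCAL algorithm or its EF1 definition, neither of which is given in this record. It assumes each result carries a relevance probability, scores a candidate cluster with a ratio-of-expectations approximation of F1, and picks the best of a few candidate bounding boxes. The names `Result`, `approx_expected_f1`, `in_box`, and `best_box` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


@dataclass
class Result:
    x: float
    y: float
    relevance: float  # ASSUMPTION: probability of relevance in [0, 1]


def approx_expected_f1(results: Sequence[Result], selected: Sequence[int]) -> float:
    """Approximate expected F1 of a selected subset.

    ASSUMPTION: uses 2 * E[true positives] / (|selected| + E[#relevant]),
    a ratio-of-expectations approximation, not the article's exact EF1.
    """
    expected_tp = sum(results[i].relevance for i in selected)
    expected_relevant = sum(r.relevance for r in results)
    denom = len(selected) + expected_relevant
    return 2.0 * expected_tp / denom if denom > 0 else 0.0


def in_box(r: Result, box: Box) -> bool:
    """Spatial coherence expressed as a constraint: the result lies in the box."""
    x0, y0, x1, y1 = box
    return x0 <= r.x <= x1 and y0 <= r.y <= y1


def best_box(results: Sequence[Result], boxes: Sequence[Box]) -> Tuple[float, Box]:
    """Return (score, box) for the candidate box with the highest score."""
    scored: List[Tuple[float, Box]] = []
    for box in boxes:
        selected = [i for i, r in enumerate(results) if in_box(r, box)]
        scored.append((approx_expected_f1(results, selected), box))
    return max(scored, key=lambda pair: pair[0])


results = [
    Result(0, 0, 0.9),
    Result(1, 1, 0.8),
    Result(5, 5, 0.1),
    Result(6, 6, 0.2),
]
boxes = [(0, 0, 1, 1), (5, 5, 6, 6), (0, 0, 6, 6)]
score, box = best_box(results, boxes)
print(box, round(score, 3))
```

In this toy data the small box around the two high-relevance results scores higher than the box covering everything, because including low-relevance results enlarges the cluster without adding much expected relevance.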