ILS-SUMM: Iterated Local Search for Unsupervised Video Summarization
Format | Journal Article |
---|---|
Language | English |
Published | 08.12.2019 |
Summary: | In recent years, there has been an increasing interest in building video
summarization tools, where the goal is to automatically create a short summary
of an input video that properly represents the original content. We consider
shot-based video summarization where the summary consists of a subset of the
video shots, which can be of various lengths. A straightforward way to
maximize the representativeness of a subset of shots is to minimize the total
distance between each shot and its nearest selected shot. We formulate the task
of video summarization as an optimization problem with a knapsack-like
constraint on the total summary duration. Previous studies have proposed greedy
algorithms to solve this problem approximately, but no experiments were
presented to measure the ability of these methods to obtain solutions with low
total distance. Indeed, our experiments on video summarization datasets show
that current methods leave considerable room for improvement in reaching low
total distance. In this paper, we develop
ILS-SUMM, a novel video summarization algorithm to solve the subset selection
problem under the knapsack constraint. Our algorithm is based on the well-known
metaheuristic optimization framework -- Iterated Local Search (ILS), known for
its ability to avoid weak local minima and obtain a good near-global minimum.
Extensive experiments show that our method finds solutions with significantly
better total distance than previous methods. Moreover, to indicate the high
scalability of ILS-SUMM, we introduce a new dataset consisting of videos of
various lengths. |
---|---|
DOI: | 10.48550/arxiv.1912.03650 |
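
The optimization problem described in the summary can be illustrated with a small sketch: choose a subset of shots that minimizes the total distance from every shot to its nearest selected shot, subject to a limit on total summary duration, and search for it with an Iterated Local Search loop. This is a minimal illustration and not the authors' implementation. All function names, the add/swap move set, the perturbation step, and the acceptance rule are assumptions made for this example; the record above does not specify them.

```python
import random


def total_distance(dist, selected):
    """Sum, over all shots, of the distance to the nearest selected shot."""
    if not selected:
        return float("inf")  # an empty summary represents nothing
    return sum(min(row[j] for j in selected) for row in dist)


def duration(durations, selected):
    """Total duration of the selected shots (the knapsack weight)."""
    return sum(durations[j] for j in selected)


def local_search(dist, durations, budget, selected):
    """First-improvement search over 'add a shot' and 'swap a shot' moves
    (an assumed move set), keeping the duration within the budget."""
    selected = set(selected)
    n = len(durations)
    improved = True
    while improved:
        improved = False
        current = total_distance(dist, selected)
        for j in range(n):
            if j in selected:
                continue
            # Candidate moves: add shot j, or swap one selected shot for j.
            moves = [selected | {j}]
            moves += [(selected - {i}) | {j} for i in sorted(selected)]
            for cand in moves:
                if (duration(durations, cand) <= budget
                        and total_distance(dist, cand) < current):
                    selected, improved = cand, True
                    break
            if improved:
                break
    return selected


def perturb(durations, budget, selected, rng):
    """Assumed perturbation: drop one random selected shot, then add one
    random shot that still fits in the budget (if any)."""
    selected = set(selected)
    if selected:
        selected.remove(rng.choice(sorted(selected)))
    used = duration(durations, selected)
    fits = [j for j in range(len(durations))
            if j not in selected and used + durations[j] <= budget]
    if fits:
        selected.add(rng.choice(fits))
    return selected


def ils_sketch(dist, durations, budget, iterations=50, seed=0):
    """Iterated Local Search: repeat perturbation + local search and keep
    the best summary found (accept only strict improvements)."""
    rng = random.Random(seed)
    best = local_search(dist, durations, budget, set())
    best_cost = total_distance(dist, best)
    for _ in range(iterations):
        start = perturb(durations, budget, best, rng)
        cand = local_search(dist, durations, budget, start)
        cost = total_distance(dist, cand)
        if cost < best_cost:
            best, best_cost = cand, cost
    return best, best_cost


# Toy usage: six shots with 1-D features forming two clusters,
# unit durations, and a budget that allows two shots in the summary.
features = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in features] for a in features]
summary, cost = ils_sketch(dist, [1] * 6, budget=2)
```

In the toy example, the two clusters are best represented by their middle shots, so the search is expected to pick one shot from each cluster. Real shot distances would come from feature vectors rather than scalars, and the shot durations would generally differ.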