An Approach for Video Summarization based on Unsupervised Learning using Deep Semantic Features and Keyframe Extraction
Published in | 2024 5th International Conference on Image Processing and Capsule Networks (ICIPCN) pp. 520 - 527 |
---|---|
Format | Conference Proceeding |
Language | English |
Published | IEEE, 03.07.2024 |
DOI | 10.1109/ICIPCN63822.2024.00091 |
Summary: | Digital videos have emerged as a prominent and influential medium for distributing information, and their consumption on both offline and online platforms has increased unprecedentedly in recent years. One major problem with video information extraction is that, unlike images, where information can be gathered from a single frame, videos require the user to watch the entire content to comprehend the context. Video summarization is a promising method for efficiently understanding video material, as it selects the relevant scenes from a video. Because video content varies widely, from home movies to documentaries, and prior knowledge is rarely available, summarization is considerably more difficult; the goal here is to produce a summary that is engaging to the user and accurately represents the context of the content using an unsupervised learning approach. To this end, this article presents an unsupervised learning strategy for video summarization that combines a classic vision-based algorithm, SIFT, with a deep convolutional neural network. The deep learning technique extracts deep video features that represent multiple layers of content semantics, such as actions, scenes, and objects, boosting the performance of baseline video summarization methods through accurate feature extraction from video frames. To construct a summary, deep features are extracted from scenes taken directly from the original video, and an unsupervised k-means clustering-based summarization algorithm is then applied. The resulting summaries are evaluated against reference approaches and state-of-the-art models on two widely recognized benchmark datasets. The results demonstrate that vision-based feature extraction yields superior performance in the context of video summarization. |
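The keyframe-selection step described in the abstract — clustering per-frame feature vectors with k-means and keeping one representative frame per cluster — can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function name `kmeans_keyframes`, the cluster count, and the feature dimensionality are placeholders, and the paper's actual deep feature extractor is not reproduced here.

```python
import numpy as np

def kmeans_keyframes(features, k, iters=50, seed=0):
    """Select k keyframe indices from per-frame feature vectors.

    Illustrative sketch (not the authors' implementation): run a minimal
    k-means over the frame features, then return, for each cluster, the
    index of the member frame closest to the cluster centroid.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(features, dtype=float)
    # Initialise centroids from k distinct randomly chosen frames.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Distance of every frame to every centroid, shape (n, k).
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster empties.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    # The representative keyframe of each cluster is the member frame
    # nearest its centroid.
    keyframes = []
    for j in range(k):
        members = np.where(labels == j)[0]
        if members.size:
            keyframes.append(int(members[d[members, j].argmin()]))
    return sorted(keyframes)
```

In a full pipeline, `features` would hold the deep semantic features (e.g. CNN activations) of sampled frames, and the selected indices would be mapped back to scenes of the original video to assemble the summary.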