An Approach for Video Summarization based on Unsupervised Learning using Deep Semantic Features and Keyframe Extraction
Published in | 2024 5th International Conference on Image Processing and Capsule Networks (ICIPCN) pp. 520 - 527 |
---|---|
Format | Conference Proceeding |
Language | English |
Published | IEEE, 03.07.2024 |
DOI | 10.1109/ICIPCN63822.2024.00091 |
Summary: | Digital videos have emerged as a prominent and influential medium for distributing information, and their consumption on both offline and online platforms has increased unprecedentedly in recent years. One major problem with video information extraction is that, unlike images, where information can be gathered from a single frame, videos require the user to watch the entire content to comprehend the context. Video summarization is a promising method for efficiently understanding video material, as it selects the relevant scenes from a video. Because video content varies widely, from home movies to documentaries, and prior knowledge is rarely available, summarization is considerably more difficult; the goal here is to produce a summary that is engaging to the user and accurately represents the context of the content using an unsupervised learning approach. To this end, this article presents an unsupervised learning strategy for video summarization that combines a classic vision-based algorithm, SIFT, with a deep convolutional neural network. The deep learning technique extracts deep video features that represent multiple layers of content semantics, such as actions, scenes, and objects, boosting the performance of baseline video summarization methods through accurate feature extraction from video frames. To construct a summary, deep features are extracted from scenes taken directly from the original video, and an unsupervised k-means clustering-based summarization algorithm is then applied. The resulting summaries are evaluated against reference approaches and state-of-the-art models on two widely recognized benchmark datasets. The results demonstrate that vision-based feature extraction yields superior performance in the context of video summarization. |
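The keyframe-selection step described in the abstract — clustering per-frame feature vectors with k-means and keeping one representative frame per cluster — can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function name `kmeans_keyframes`, the cluster count, and the feature dimensionality are placeholders, and the paper's actual deep feature extractor is not reproduced here.

```python
import numpy as np

def kmeans_keyframes(features, k, iters=50, seed=0):
    """Select k keyframe indices from per-frame feature vectors.

    Illustrative sketch (not the authors' implementation): run a minimal
    k-means over the frame features, then return, for each cluster, the
    index of the member frame closest to the cluster centroid.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(features, dtype=float)
    # Initialise centroids from k distinct randomly chosen frames.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Distance of every frame to every centroid, shape (n, k).
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster empties.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    # The representative keyframe of each cluster is the member frame
    # nearest its centroid.
    keyframes = []
    for j in range(k):
        members = np.where(labels == j)[0]
        if members.size:
            keyframes.append(int(members[d[members, j].argmin()]))
    return sorted(keyframes)
```

In a full pipeline, `features` would hold the deep semantic features (e.g. CNN activations) of sampled frames, and the selected indices would be mapped back to scenes of the original video to assemble the summary.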