Multilevel Deep Learning-Based Processing for Lifelog Image Retrieval Enhancement


Bibliographic Details
Published in: Conference proceedings - IEEE International Conference on Systems, Man, and Cybernetics, pp. 1348-1354
Main Authors: Ben Abdallah, Fatma; Feki, Ghada; Ben Ammar, Anis; Ben Amar, Chokri
Format: Conference Proceeding
Language: English
Published: IEEE, 01.10.2018
ISSN: 2577-1655
DOI: 10.1109/SMC.2018.00236

More Information
Summary: Remembering an event or a meeting, recalling the face or the name of a person, keeping in mind what we ate or the place of a lost object is sometimes a difficult task. The human memory has its limits. To go beyond these limits, researchers have developed sensors and wearable cameras to capture individuals' experiences. This trend, called lifelog, has recently been the subject of several panels, workshops and benchmarks. By analyzing the lifelog tasks of these events more closely, we notice that there are still challenges in managing, analyzing, indexing, retrieving, summarizing and visualizing the captured data. In this work, we present a multilevel deep learning-based processing for lifelog image retrieval enhancement. Our proposed approach consists of five phases in which we use deep learning at several levels. The first phase performs data pre-processing based on low-level image features to filter out irrelevant, noisy and blurred images. In the second phase, we detect and cross high-level image features using a pre-trained CNN to enrich the image metadata description. Then, we perform a semantic segmentation based on the Wu-Palmer similarity measure; this segmentation limits the search area and gives better control over runtime and complexity. The fourth phase consists of analyzing the query using an LSTM to match concepts with queries. The final phase, based on doc2sequence, aims at retrieving the images that answer the query.
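The first phase filters out blurred frames using low-level features. As a rough illustration (not the authors' code), a common sharpness cue is the variance of the image's Laplacian response: blurred images have few edges, so the response is nearly flat. The threshold below is an invented placeholder, and the grayscale image is a plain list of rows for self-containment:

```python
def laplacian_variance(img):
    """Variance of the 4-neighbour Laplacian over the image interior."""
    h, w = len(img), len(img[0])
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Discrete Laplacian: sum of the 4 neighbours minus 4x the centre.
            lap = (img[y - 1][x] + img[y + 1][x]
                   + img[y][x - 1] + img[y][x + 1]
                   - 4 * img[y][x])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)

def is_sharp(img, threshold=100.0):
    # Low Laplacian variance -> few edges -> likely blurred; keep sharp frames.
    return laplacian_variance(img) >= threshold
```

In practice this would run over the camera's raw frame stream (e.g. with an optimized Laplacian from an image library) before any deep model is applied.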
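The third phase relies on the Wu-Palmer measure, which scores two concepts by the depth of their least common subsumer (LCS) in an is-a hierarchy: wup(c1, c2) = 2 * depth(LCS) / (depth(c1) + depth(c2)). A minimal sketch over a hand-made toy taxonomy (the paper would use a real concept hierarchy such as WordNet, not this dictionary):

```python
# Toy is-a taxonomy (child -> parent); purely illustrative.
PARENT = {
    "dog": "canine", "cat": "feline",
    "canine": "carnivore", "feline": "carnivore",
    "carnivore": "mammal", "mammal": "animal", "animal": "entity",
}

def path_to_root(concept):
    """List of nodes from the concept up to the root, inclusive."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def depth(concept):
    # The root ("entity") has depth 1 in this toy hierarchy.
    return len(path_to_root(concept))

def wu_palmer(c1, c2):
    """wup(c1, c2) = 2 * depth(LCS) / (depth(c1) + depth(c2))."""
    ancestors1 = set(path_to_root(c1))
    # Walking up from c2, the first shared ancestor is the LCS.
    for node in path_to_root(c2):
        if node in ancestors1:
            lcs = node
            break
    return 2.0 * depth(lcs) / (depth(c1) + depth(c2))
```

For example, "dog" and "cat" meet at "carnivore", so nearby concepts score high (close to 1) and can be grouped into one semantic segment, which is how such a measure can shrink the search area.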