Unsupervised Generation of Context-Relevant Training-Sets for Visual Object Recognition Employing Multilinguality

Bibliographic Details
Published in: 2015 IEEE Winter Conference on Applications of Computer Vision, pp. 805 - 812
Main Authors: Schoeler, Markus; Wörgötter, Florentin; Kulvicius, Tomas; Papon, Jeremie
Format: Conference Proceeding
Language: English
Published: IEEE, 01.01.2015

Summary: Image-based object classification requires clean training data sets. Gathering such sets is usually done manually by humans, which is time-consuming and laborious. On the other hand, directly using images from search engines creates very noisy data due to ambiguous noun-focused indexing. However, in daily speech nouns and verbs are always coupled. We use this for the automatic generation of clean data sets by the here-presented TRANSCLEAN algorithm, which, through the use of multiple languages, also solves the problem of polysemes (a single spelling with multiple meanings). Thus, we use the implicit knowledge contained in verbs, e.g. in an imperative such as "hit the nail", which implies a metal nail and not the fingernail. One type of reference application where this method can operate automatically is human-robot collaboration based on discourse. A second is the generation of clean image data sets, where tedious manual cleaning can be replaced by the much simpler manual specification of a single relevant verb-noun tuple. Here we show the impact of our improved training sets for several widely used and state-of-the-art classifiers, including Multipath Hierarchical Matching Pursuit. All tested classifiers show a substantial boost of about +20% in recognition performance.
ISSN: 1550-5790, 2642-9381
DOI: 10.1109/WACV.2015.112
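
Illustration: The summary describes TRANSCLEAN only at a high level, so the following is a minimal Python sketch of the underlying idea as stated in the abstract: translate a disambiguating verb-noun tuple (e.g. "hit the nail") into several languages and keep only the images that queries in multiple languages agree on. The translation table, the faked image_search results, and the min_votes threshold are illustrative assumptions for this sketch, not the authors' implementation or data.

from collections import Counter
from typing import Dict, List

# Hypothetical verb-noun queries for the tuple ("hit", "nail").
# In French and Spanish the metal-nail sense has a distinct word
# ("clou", "clavo"), so those translated queries are unambiguous.
TRANSLATIONS: Dict[str, str] = {
    "en": "hit nail",
    "de": "Nagel schlagen",
    "fr": "frapper clou",
    "es": "golpear clavo",
}

def image_search(query: str) -> List[str]:
    """Stand-in for a search-engine image query (hypothetical).
    Returns identifiers of retrieved images; faked with canned
    results so the sketch runs without network access."""
    fake_index = {
        "hit nail": ["metal_nail_1", "metal_nail_2", "finger_nail_1"],
        "Nagel schlagen": ["metal_nail_1", "metal_nail_3"],
        "frapper clou": ["metal_nail_1", "metal_nail_2"],
        "golpear clavo": ["metal_nail_2", "metal_nail_3"],
    }
    return fake_index.get(query, [])

def transclean_style_set(translations: Dict[str, str], min_votes: int = 2) -> List[str]:
    """Keep images retrieved under several languages: results that
    agree across translations likely show the intended (metal) nail."""
    votes: Counter = Counter()
    for query in translations.values():
        votes.update(set(image_search(query)))  # one vote per language
    return sorted(img for img, n in votes.items() if n >= min_votes)

if __name__ == "__main__":
    print(transclean_style_set(TRANSLATIONS))
    # ['metal_nail_1', 'metal_nail_2', 'metal_nail_3']

In this toy run the ambiguous finger_nail_1 result is retrieved only by the English query and is therefore discarded, while the metal-nail images survive the cross-language vote and form the cleaned training set.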