Transfer Metric Learning for Unseen Domains

We propose a transfer metric learning method to infer domain-specific data embeddings for unseen domains, from which no data are given in the training phase, by using knowledge transferred from related domains. When training and test distributions are different, the standard metric learning cannot i...

Full description

Saved in:
Bibliographic Details
Published inData Science and Engineering Vol. 5; no. 2; pp. 140 - 151
Main Authors Kumagai, Atsutoshi, Iwata, Tomoharu, Fujiwara, Yasuhiro
Format Journal Article
LanguageEnglish
Published Berlin/Heidelberg Springer Berlin Heidelberg 01.06.2020
Springer
Springer Nature B.V
SpringerOpen
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:We propose a transfer metric learning method to infer domain-specific data embeddings for unseen domains, from which no data are given in the training phase, by using knowledge transferred from related domains. When training and test distributions are different, the standard metric learning cannot infer appropriate data embeddings. The proposed method can infer appropriate data embeddings for the unseen domains by using latent domain vectors, which are latent representations of domains and control the property of data embeddings for each domain. This latent domain vector is inferred by using a neural network that takes the set of feature vectors in the domain as an input. The neural network is trained without the unseen domains. The proposed method can instantly infer data embeddings for the unseen domains without (re)-training once the sets of feature vectors in the domains are given. To accumulate knowledge in advance, the proposed method uses labeled and unlabeled data in multiple source domains. Labeled data, i.e., data with label information such as class labels or pair (similar/dissimilar) constraints, are used for learning data embeddings in such a way that similar data points are close and dissimilar data points are separated in the embedding space. Although unlabeled data do not have labels, they have geometric information that characterizes domains. The proposed method incorporates this information in a natural way on the basis of a probabilistic framework. The conditional distributions of the latent domain vectors, the embedded data, and the observed data are parameterized by neural networks and are optimized by maximizing the variational lower bound using stochastic gradient descent. The effectiveness of the proposed method was demonstrated through experiments using three clustering tasks.
ISSN:2364-1185
2364-1541
DOI:10.1007/s41019-020-00125-1