Interpretation of Structural Preservation in Low-Dimensional Embeddings

Bibliographic Details
Published in	IEEE Transactions on Knowledge and Data Engineering, Vol. 34, no. 5, pp. 2227-2240
Main Authors	Ghosh, Aindrila; Nashaat, Mona; Miller, James; Quader, Shaikh
Format	Journal Article
Language	English
Published	New York: IEEE, 01.05.2022; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects	Algorithms; algorithms for data and knowledge management; Approximation algorithms; Bridges; data and knowledge visualization; Data visualization; Datasets; Dimensionality reduction; Embedding; Interactive data exploration and discovery; Manifolds; Optimization; Reduction; Visualization

More Information
Summary:	Despite being commonly used in big-data analytics, the outcome of dimensionality reduction remains a black box to most of its users. Understanding the quality of a low-dimensional embedding is important because it not only enables trust in the transformed data but can also help select the most appropriate dimensionality reduction algorithm in a given scenario. As existing research primarily focuses on the visual exploration of embeddings, there is still a need to enhance the interpretability of such algorithms. To bridge this gap, we propose two novel interactive explanation techniques for low-dimensional embeddings obtained from any dimensionality reduction algorithm. The first technique, LAPS, produces a local approximation of the neighborhood structure to generate interpretable explanations of the preserved locality for a single instance. The second method, GAPS, explains the retained global structure of a high-dimensional dataset in its embedding by combining non-redundant local approximations from a coarse discretization of the projection space. We demonstrate the applicability of the proposed techniques using 16 real-life tabular, text, image, and audio datasets. Our extensive experimental evaluation shows the utility of the proposed techniques in interpreting the quality of low-dimensional embeddings, as well as in selecting the most suitable dimensionality reduction algorithm for any given dataset.
ISSN:	1041-4347, 1558-2191
DOI:	10.1109/TKDE.2020.3005878
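
Illustrative note: the Summary refers to the "preserved locality for a single instance". This record does not describe how LAPS computes its explanations, so the following is not the authors' method. It is a minimal sketch of one common way to quantify locality preservation for a single point, namely the overlap between its k nearest neighbors before and after projection. The function names, the synthetic data, and the coordinate-dropping "embedding" are all assumptions made for illustration.

```python
import math
import random


def knn_indices(points, i, k):
    """Return the set of indices of the k nearest neighbors of points[i]
    (Euclidean distance), excluding the point itself."""
    dists = [(math.dist(points[i], p), j) for j, p in enumerate(points) if j != i]
    dists.sort()
    return {j for _, j in dists[:k]}


def neighborhood_overlap(high, low, i, k=5):
    """Fraction of point i's k nearest neighbors in `high` that are also
    among its k nearest neighbors in `low` (a value between 0 and 1)."""
    return len(knn_indices(high, i, k) & knn_indices(low, i, k)) / k


# Synthetic 10-dimensional data (hypothetical, for illustration only).
random.seed(0)
high = [[random.gauss(0, 1) for _ in range(10)] for _ in range(200)]

# Stand-in "embedding": keep the first two coordinates.
# This is NOT a real dimensionality reduction algorithm.
low = [row[:2] for row in high]

score = neighborhood_overlap(high, low, i=0, k=10)
identity_score = neighborhood_overlap(high, high, i=0, k=10)
print("overlap for instance 0:", score)
print("overlap when the embedding equals the input:", identity_score)
```

A score near 1 suggests that the neighborhood of the chosen instance survived the projection, and a score near 0 suggests that it did not. Comparing such scores across embeddings from different algorithms is one simple way to approach the algorithm-selection question the Summary raises.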