Finding mixed memberships in categorical data

Bibliographic Details
Published in: arXiv.org
Main Author: Qing, Huan
Format: Paper, Journal Article
Language: English
Published: Ithaca, Cornell University Library, arXiv.org, 05.06.2024
Summary: Latent class analysis, a fundamental problem in categorical data analysis, becomes harder when latent classes overlap. This paper addresses that case by estimating the latent mixed memberships of subjects in categorical data with polytomous responses. We employ the Grade of Membership (GoM) model, which assigns each subject a membership score in each latent class, and propose two efficient spectral algorithms for estimating these mixed memberships and the other GoM parameters. Both algorithms are based on the singular value decomposition of a regularized Laplacian matrix. We establish their convergence rates under a mild condition on data sparsity. Additionally, we introduce a metric for evaluating the quality of estimated mixed memberships on real-world categorical data, and we use this metric to determine the optimal number of latent classes. Finally, we demonstrate the practicality of our methods through experiments on both computer-generated and real-world categorical datasets.
ISSN: 2331-8422
DOI: 10.48550/arxiv.2312.01565
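
The summary states that the proposed algorithms rest on the singular value decomposition of a regularized Laplacian matrix, but this record does not give the construction. The following is a minimal sketch of that one step only, under loudly stated assumptions: the Laplacian is taken to be D_tau^{-1/2} R with D_tau = diag(row sums of R) + tau * I, tau defaults to the average row sum, and the toy responses are drawn as Binomial(M, (Pi Theta^T)/M). None of these choices, nor the names `regularized_laplacian_svd`, `Pi`, `Theta`, and `tau`, come from the record; they are hypothetical and may differ from the paper's definitions. The later steps that turn singular vectors into membership estimates are not shown.

```python
import numpy as np


def regularized_laplacian_svd(R, K, tau=None):
    """Top-K SVD of D_tau^{-1/2} R (an assumed form of the regularized Laplacian).

    R   : N x J response matrix with nonnegative entries.
    K   : number of latent classes (leading singular triplets kept).
    tau : regularizer added to each row sum; assumed default is the mean row sum.
    """
    R = np.asarray(R, dtype=float)
    row_sums = R.sum(axis=1)
    if tau is None:
        tau = row_sums.mean()  # assumption, not taken from the record
    scale = 1.0 / np.sqrt(row_sums + tau)
    L = scale[:, None] * R  # row-wise scaling, i.e. D_tau^{-1/2} @ R
    U, s, Vt = np.linalg.svd(L, full_matrices=False)
    return U[:, :K], s[:K], Vt[:K, :].T


# Toy data under an assumed GoM-style generating model (hypothetical setup).
rng = np.random.default_rng(0)
N, J, K, M = 200, 30, 3, 4
Pi = rng.dirichlet(np.ones(K), size=N)   # N x K memberships, each row sums to 1
Theta = rng.uniform(0, M, size=(J, K))   # J x K item parameters in [0, M)
R = rng.binomial(M, (Pi @ Theta.T) / M)  # polytomous responses in {0, ..., M}

U, s, V = regularized_laplacian_svd(R, K)
```

Here `U` holds one K-dimensional row per subject and `V` one per item. In mixed-membership spectral methods generally, rows of `U` are post-processed to recover membership scores; how the paper's two algorithms do so is not stated in this record.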