Rates of convergence for Laplacian semi-supervised learning with low labeling rates

Bibliographic Details
Published in: Research in the Mathematical Sciences, Vol. 10, No. 1
Main Authors: Calder, Jeff; Slepčev, Dejan; Thorpe, Matthew
Format: Journal Article
Language: English
Published: Cham: Springer International Publishing (Springer Nature B.V.), 01.03.2023

Summary: We investigate graph-based Laplacian semi-supervised learning at low labeling rates (ratios of labeled to total number of data points) and establish a threshold for the learning to be well-posed. Laplacian learning uses harmonic extension on a graph to propagate labels. It is known that when the number of labeled data points is finite while the number of unlabeled data points tends to infinity, Laplacian learning becomes degenerate and the solutions become roughly constant with a spike at each labeled data point. In this work, we allow the number of labeled data points to grow to infinity as the total number of data points grows. We show that for a random geometric graph with length scale ε > 0, if the labeling rate β ≪ ε², then the solution becomes degenerate and spikes form. On the other hand, if β ≫ ε², then Laplacian learning is well-posed and consistent with a continuum Laplace equation. Furthermore, in the well-posed setting we prove quantitative error estimates of O(εβ^{-1/2}) for the difference between the solutions of the discrete problem and the continuum PDE, up to logarithmic factors. We also study p-Laplacian regularization and show the same degeneracy result when β ≪ ε^p. The proofs of our well-posedness results use the random walk interpretation of Laplacian learning and PDE arguments, while the proofs of the ill-posedness results use Γ-convergence tools from the calculus of variations. We also present numerical results on synthetic and real data to illustrate our results.
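
Illustration: the harmonic-extension step described above (fix the given labels and solve the graph Laplace equation on the unlabeled nodes) can be sketched in a few lines of Python with numpy/scipy. This is a hypothetical minimal example, not the authors' code; the function name, the unit edge weights, the synthetic half-space labels, and the particular values of n, ε, and β are illustrative assumptions. Setting β well above ε² corresponds to the well-posed regime of the paper; β ≪ ε² should instead produce near-constant solutions with spikes at the labeled points.

# Sketch (assumed setup, not the authors' implementation): Laplacian learning on a random
# geometric graph in the unit square, with labeling rate beta relative to eps^2.
import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import coo_matrix
from scipy.sparse.linalg import spsolve

def laplacian_learning(X, labeled_idx, labels, eps):
    # Harmonic extension of `labels` from `labeled_idx` over the eps-graph on X.
    n = X.shape[0]
    pairs = cKDTree(X).query_pairs(r=eps, output_type='ndarray')   # edges within distance eps
    i = np.concatenate([pairs[:, 0], pairs[:, 1]])
    j = np.concatenate([pairs[:, 1], pairs[:, 0]])
    W = coo_matrix((np.ones(len(i)), (i, j)), shape=(n, n)).tocsr() # unit edge weights (assumption)
    deg = np.asarray(W.sum(axis=1)).ravel()
    D = coo_matrix((deg, (np.arange(n), np.arange(n))), shape=(n, n)).tocsr()
    L = D - W                                                       # unnormalized graph Laplacian

    u = np.zeros(n)
    u[labeled_idx] = labels
    unlabeled = np.ones(n, dtype=bool)
    unlabeled[labeled_idx] = False
    # Solve L_UU u_U = -L_UL u_L: u is harmonic on unlabeled nodes, fixed at labeled nodes.
    L_UU = L[unlabeled][:, unlabeled]
    rhs = -L[unlabeled][:, ~unlabeled] @ u[~unlabeled]
    u[unlabeled] = spsolve(L_UU.tocsc(), rhs)
    return u

# Toy experiment (illustrative parameters): n points, connectivity radius eps, labeling rate beta.
rng = np.random.default_rng(0)
n, eps = 2000, 0.08
X = rng.random((n, 2))
beta = 10 * eps**2                                   # beta >> eps^2: well-posed regime
m = max(2, int(beta * n))                            # number of labeled points
labeled_idx = rng.choice(n, size=m, replace=False)
labels = (X[labeled_idx, 0] > 0.5).astype(float)     # synthetic labels from a half-space rule
u = laplacian_learning(X, labeled_idx, labels, eps)
print("fraction predicted in class 1:", np.mean(u > 0.5))

Rerunning the experiment with, say, beta = 0.1 * eps**2 (and a larger n so that some labels remain) gives a qualitative picture of the degenerate regime: the solution u is nearly constant away from the labeled points.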
ISSN: 2522-0144; 2197-9847
DOI: 10.1007/s40687-022-00371-x