Fast Graph Neural Tangent Kernel via Kronecker Sketching
Main Authors | , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 04.12.2021 |
Subjects | |
Online Access | Get full text |
Summary: | Many deep learning tasks have to deal with graphs (e.g., protein
structures, social networks, source code abstract syntax trees). Due to the
importance of these tasks, Graph Neural Networks (GNNs) have become the de
facto method for learning on graphs, and they are now widely applied thanks to
their convincing performance. Unfortunately, one major barrier to using GNNs
is that they require substantial time and resources to train. Recently, a new
method for learning on graph data is the Graph Neural Tangent Kernel (GNTK)
[Du, Hou, Salakhutdinov, Poczos, Wang and Xu 19]. GNTK is an application of
the Neural Tangent Kernel (NTK) [Jacot, Gabriel and Hongler 18], a kernel
method, to graph data, and solving NTK regression is equivalent to training an
infinitely wide neural network with gradient descent. The key benefit of using
GNTK is that, as with any kernel method, GNTK's parameters can be solved for
directly in a single step, which avoids time-consuming gradient descent.
Meanwhile, sketching has become increasingly used to speed up various
optimization problems, including kernel regression: given a kernel matrix of
$n$ graphs, sketching can reduce the running time of solving kernel regression
to $o(n^3)$. Unfortunately, such methods usually require extensive knowledge
about the kernel matrix beforehand, while in the case of GNTK we find that
merely constructing the kernel matrix already takes $O(n^2 N^4)$ time,
assuming each graph has $N$ nodes. The kernel matrix construction time can
therefore become the major performance bottleneck as the graph size $N$
increases. A natural question is thus whether we can speed up the kernel
matrix construction to improve GNTK regression's end-to-end running time. This
paper provides the first algorithm to construct the kernel matrix in
$o(n^2 N^3)$ running time. |
---|---|
DOI: | 10.48550/arxiv.2112.02446 |
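
Two of the abstract's claims are easy to make concrete: that a kernel method is "trained" in a single linear solve, and that a sketch with $m \ll n$ features brings the solve below $O(n^3)$. The code below is a minimal illustration under assumed choices (an RBF kernel standing in for the GNTK, random Fourier features as the sketch, synthetic data); it is not the paper's Kronecker-sketching algorithm.

```python
import numpy as np

def rbf_kernel(X, gamma=0.5):
    """Dense n x n RBF kernel matrix (illustrative stand-in for the GNTK)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * d2)

def solve_exact(K, y, lam=1e-3):
    """Closed-form kernel ridge regression: one O(n^3) linear solve,
    no gradient descent."""
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), y)

def random_fourier_features(X, m, gamma=0.5, seed=0):
    """Random Fourier features Z with Z @ Z.T approximating K; solving in
    the m-dimensional feature space costs O(n m^2), i.e. o(n^3) for m << n."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], m)) * np.sqrt(2.0 * gamma)
    b = rng.uniform(0.0, 2.0 * np.pi, size=m)
    return np.sqrt(2.0 / m) * np.cos(X @ W + b)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, d, lam = 1000, 8, 1e-3
    X = rng.standard_normal((n, d))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)

    K = rbf_kernel(X)
    alpha = solve_exact(K, y, lam)           # one-step "training"

    Z = random_fourier_features(X, m=100)    # sketched solve: m x m system
    w = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ y)

    print("exact fit residual:   ", np.linalg.norm(K @ alpha - y))
    print("sketched fit residual:", np.linalg.norm(Z @ w - y))
```

Here $m$ controls the usual trade-off: a larger sketch tracks the exact solution more closely while the solve cost grows as $O(n m^2)$.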
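
The $O(n^2 N^4)$ construction cost can also be made concrete. Assuming a GNTK-style layer update, each layer aggregates an $N \times N$ node-pair kernel $\Sigma$ over neighbor pairs, $\Sigma'[v, v'] = \sum_{u \in N(v)} \sum_{u' \in N(v')} \Sigma[u, u']$, which naively costs $O(N^4)$ per graph pair, across $O(n^2)$ pairs of graphs. The helper names and random adjacency matrices below are illustrative, not the paper's code.

```python
import numpy as np

def aggregate_naive(A1, A2, Sigma):
    """Direct double sum over neighbor pairs: O(N^4) per graph pair."""
    N1, N2 = A1.shape[0], A2.shape[0]
    out = np.zeros((N1, N2))
    for v in range(N1):
        for vp in range(N2):
            nb1 = np.flatnonzero(A1[v])    # neighbors of v in graph 1
            nb2 = np.flatnonzero(A2[vp])   # neighbors of v' in graph 2
            out[v, vp] = Sigma[np.ix_(nb1, nb2)].sum()
    return out

def aggregate_matmul(A1, A2, Sigma):
    """The same quantity written as the triple product A1 @ Sigma @ A2.T."""
    return A1 @ Sigma @ A2.T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N = 30
    A1 = (rng.random((N, N)) < 0.2).astype(float)
    A2 = (rng.random((N, N)) < 0.2).astype(float)
    Sigma = rng.standard_normal((N, N))
    assert np.allclose(aggregate_naive(A1, A2, Sigma),
                       aggregate_matmul(A1, A2, Sigma))
```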