Simplifying approach to node classification in Graph Neural Networks


Bibliographic Details
Published in: Journal of Computational Science, Vol. 62, p. 101695
Main Authors: Maurya, Sunil Kumar; Liu, Xin; Murata, Tsuyoshi
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.07.2022

Summary: Graph Neural Networks (GNNs) have become one of the indispensable tools for learning from graph-structured data, and their usefulness has been demonstrated in a wide variety of tasks. In recent years, there have been tremendous improvements in architecture design, resulting in better performance on various prediction tasks. In general, these neural architectures combine node feature aggregation and feature transformation using a learnable weight matrix in the same layer. This makes it challenging to analyze the importance of node features aggregated from various hops and the expressiveness of the neural network layers. As different graph datasets show varying levels of homophily and heterophily in feature and class label distributions, it becomes essential to understand which features are important for the prediction task without any prior information. In this work, we decouple the node feature aggregation step from the depth of the graph neural network, and empirically analyze how different aggregated features contribute to prediction performance. We show that not all features generated via aggregation steps are useful, and that using these less informative features can even be detrimental to the performance of the GNN model. Through our experiments, we show that learning certain subsets of these features can lead to better performance on a wide variety of datasets. Based on our observations, we introduce several key design strategies for graph neural networks. More specifically, we propose to use softmax as a regularizer and "soft-selector" of features aggregated from neighbors at different hop distances, and L2-normalization over GNN layers. Combining these techniques, we present a simple and shallow model, Feature Selection Graph Neural Network (FSGNN), and show empirically that the proposed model achieves comparable or even higher accuracy than state-of-the-art GNN models on nine benchmark datasets for the node classification task, with remarkable improvements of up to 51.1%. Source code is available at https://github.com/sunilkmaurya/FSGNN/.

Highlights:
• Current Graph Neural Networks (GNNs) perform inconsistently on homophily and heterophily graphs.
• We analyze the importance of feature selection over hops with extensive experiments.
• With a good feature selection strategy, a simple neural network model is sufficient for high accuracy.
• We propose FSGNN, a model for the node classification task.
• FSGNN outperforms state-of-the-art models on heterophily datasets and is on par with them on homophily datasets.
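To make the abstract's design strategies concrete, below is a minimal PyTorch sketch of the FSGNN-style recipe: precompute hop-aggregated features once, transform each hop with its own linear layer, L2-normalize the hop embeddings, and let a softmax over learnable per-hop scores act as a "soft-selector" across hops. The class name FSGNNSketch, the layer sizes, and the exact hop-feature construction are illustrative assumptions, not the authors' reference implementation; see the linked repository for that.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def precompute_hop_features(adj, x, num_hops):
        # Aggregate features once in preprocessing: X, AX, A^2 X, ...
        # adj is assumed to be a normalized sparse adjacency matrix;
        # hop 0 is the raw node feature matrix itself.
        feats = [x]
        for _ in range(num_hops - 1):
            feats.append(torch.sparse.mm(adj, feats[-1]))
        return feats

    class FSGNNSketch(nn.Module):
        # Shallow model: one linear transform per precomputed hop,
        # L2-normalized hop embeddings, and a softmax "soft-selector"
        # over learnable per-hop scores.
        def __init__(self, in_dim, hidden_dim, num_classes, num_hops):
            super().__init__()
            self.hop_linears = nn.ModuleList(
                [nn.Linear(in_dim, hidden_dim) for _ in range(num_hops)]
            )
            self.hop_scores = nn.Parameter(torch.ones(num_hops))
            self.classifier = nn.Linear(hidden_dim * num_hops, num_classes)

        def forward(self, hop_feats):
            # hop_feats: list of [num_nodes, in_dim] tensors, one per hop.
            weights = torch.softmax(self.hop_scores, dim=0)
            outs = []
            for w, lin, x in zip(weights, self.hop_linears, hop_feats):
                h = F.normalize(lin(x), p=2, dim=1)  # L2-normalize per node
                outs.append(w * h)
            # Concatenate weighted hop embeddings and classify.
            return self.classifier(torch.cat(outs, dim=1))

Because aggregation happens once in preprocessing rather than layer by layer, depth is decoupled from receptive field, and the learned softmax weights directly expose which hop distances a given dataset actually relies on.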
ISSN: 1877-7503, 1877-7511
DOI: 10.1016/j.jocs.2022.101695