Experiments with different neural networks for classifying graph nodes on citation network datasets

Neural Networks on Graphs

Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. [1]

Many different approaches and algorithms enable the use of neural networks on graph data; the survey [1] summarizes them. Good introductory papers on neural networks on graphs are [2] and [3].

This project

In this project, we want to compare four different algorithms:

  • Planetoid (2016) [4]
  • ChebNet (2016) [5]
  • GCN (2017) [6]
  • GAT (2018) [7]

on three different datasets [8]:

  • Citeseer [9]
  • Cora [10]
  • Pubmed Diabetes

reproducing the experimental results reported in Table 2 of [7], i.e. node-classification accuracies.

We use the same data split as in [4, 5, 6, 7], which consists of 20 training nodes per class, 500 validation nodes, and 1000 test nodes. The remaining nodes are used during training without their labels, i.e. we are in a semi-supervised setting.
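
A split with these sizes can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in this repository: the function name `make_split`, its parameters, and the randomly labeled toy graph are all hypothetical.

```python
import random
from collections import defaultdict


def make_split(labels, train_per_class=20, n_val=500, n_test=1000, seed=0):
    """Split node indices into train / validation / test / unlabeled sets.

    Hypothetical helper: draws `train_per_class` nodes per class for training,
    then takes validation and test nodes from the shuffled remainder.
    Whatever is left is "unlabeled": seen during training, labels hidden.
    """
    rng = random.Random(seed)

    # Group node indices by class label.
    by_class = defaultdict(list)
    for node, label in enumerate(labels):
        by_class[label].append(node)

    # Same number of labeled training nodes for every class.
    train = []
    for label in sorted(by_class):
        train.extend(rng.sample(by_class[label], train_per_class))

    # Shuffle the remaining nodes and carve out validation and test sets.
    train_set = set(train)
    rest = [n for n in range(len(labels)) if n not in train_set]
    rng.shuffle(rest)
    val = rest[:n_val]
    test = rest[n_val:n_val + n_test]
    unlabeled = rest[n_val + n_test:]
    return train, val, test, unlabeled


# Toy graph with 2708 nodes and 7 classes; the labels are random placeholders.
label_rng = random.Random(1)
toy_labels = [label_rng.randrange(7) for _ in range(2708)]

train, val, test, unlabeled = make_split(toy_labels, seed=42)
print(len(train), len(val), len(test), len(unlabeled))
```

Changing `seed` yields a different random split with the same set sizes, which is the kind of variation the random-split runs rely on.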

We perform 100 runs for each algorithm, changing the seed used for weight initialization, and report the mean accuracy with its standard deviation.

In addition, we fix the weight-initialization seed and perform 100 additional runs, changing the seed used to generate random data splits.
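
The aggregation over runs can be sketched as below. This is an illustration under stated assumptions, not the repository's evaluation code: `run_experiment` is a hypothetical stand-in that returns a placeholder number instead of training a model, and the use of the sample standard deviation is an assumption.

```python
import random
import statistics


def run_experiment(init_seed):
    """Hypothetical stand-in for one training run; returns a test accuracy.

    A real run would initialize the model weights from `init_seed`, train,
    and evaluate on the test nodes. Here we only return a placeholder value.
    """
    rng = random.Random(init_seed)
    return 0.80 + rng.uniform(-0.01, 0.01)  # placeholder, not a real result


def summarize(accuracies):
    """Return (mean, sample standard deviation), both in percent."""
    mean = 100 * statistics.mean(accuracies)
    std = 100 * statistics.stdev(accuracies)  # assumption: sample std (n - 1)
    return mean, std


# One run per seed, 100 seeds in total.
accuracies = [run_experiment(seed) for seed in range(100)]
mean, std = summarize(accuracies)
print(f"{mean:.1f} ± {std:.1f}%")
```

The same loop covers the second setting by keeping the initialization seed fixed and passing the varying seed to the split generation instead.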

Experimental results

For each algorithm, we report:

  1. Original results reported in [7]
  2. Our results, 100 runs: same data splits as in [4, 5, 6, 7], changing seeds for weights initialization
  3. Our results, 100 runs: fixed weights initialization seed, changing seeds for data splits generation
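
The number after each method name refers to the corresponding item in the list above.

| Method        | Cora        | Citeseer    | Pubmed      |
|---------------|-------------|-------------|-------------|
| Planetoid (1) | 75.7%       | 64.7%       | 77.2%       |
| Planetoid (2) | 73.1 ± 0.9% | 62.3 ± 1.1% | 73.6 ± 0.7% |
| Planetoid (3) | 72.3 ± 2.0% | 59.4 ± 2.0% | 69.7 ± 2.8% |
| ChebNet (1)   | 81.2%       | 69.8%       | 74.4%       |
| ChebNet (2)   | 82.0 ± 0.6% | 70.5 ± 0.7% | 75.2 ± 1.8% |
| ChebNet (3)   | 78.9 ± 1.8% | 68.2 ± 1.9% | 73.4 ± 2.4% |
| GCN (1)       | 81.5%       | 70.3%       | 79.0%       |
| GCN (2)       | 80.6 ± 0.6% | 68.7 ± 0.9% | 78.3 ± 0.5% |
| GCN (3)       | 79.2 ± 1.7% | 68.0 ± 1.8% | 76.2 ± 2.5% |
| GAT (1)       | 83.0 ± 0.7% | 72.5 ± 0.7% | 79.0 ± 0.3% |
| GAT (2)       | 83.1 ± 0.4% | 71.7 ± 0.7% | 77.7 ± 0.4% |
| GAT (3)       | 81.0 ± 1.7% | 69.7 ± 1.7% | 77.4 ± 2.4% |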

When the data splits change, average accuracies are lower and standard deviations are higher than when the split is fixed to the one used in [4, 5, 6, 7].
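
This observation can be checked directly against the reported numbers. The sketch below copies our results for settings 2 and 3 from the table above and computes, for each method and dataset, the drop in mean accuracy and the increase in standard deviation. The dictionary layout and the function name `compare` are illustrative choices.

```python
# (mean accuracy, standard deviation) in percent, copied from the table above.
# Layout: method -> dataset -> (setting 2: fixed split, setting 3: random splits)
results = {
    "Planetoid": {
        "Cora": ((73.1, 0.9), (72.3, 2.0)),
        "Citeseer": ((62.3, 1.1), (59.4, 2.0)),
        "Pubmed": ((73.6, 0.7), (69.7, 2.8)),
    },
    "ChebNet": {
        "Cora": ((82.0, 0.6), (78.9, 1.8)),
        "Citeseer": ((70.5, 0.7), (68.2, 1.9)),
        "Pubmed": ((75.2, 1.8), (73.4, 2.4)),
    },
    "GCN": {
        "Cora": ((80.6, 0.6), (79.2, 1.7)),
        "Citeseer": ((68.7, 0.9), (68.0, 1.8)),
        "Pubmed": ((78.3, 0.5), (76.2, 2.5)),
    },
    "GAT": {
        "Cora": ((83.1, 0.4), (81.0, 1.7)),
        "Citeseer": ((71.7, 0.7), (69.7, 1.7)),
        "Pubmed": ((77.7, 0.4), (77.4, 2.4)),
    },
}


def compare(results):
    """Return (method, dataset, accuracy drop, std increase) for each entry."""
    rows = []
    for method, per_dataset in results.items():
        for dataset, (fixed, random_split) in per_dataset.items():
            drop = round(fixed[0] - random_split[0], 1)
            std_increase = round(random_split[1] - fixed[1], 1)
            rows.append((method, dataset, drop, std_increase))
    return rows


for method, dataset, drop, std_increase in compare(results):
    print(f"{method:<10}{dataset:<10}accuracy -{drop:.1f}  std +{std_increase:.1f}")
```

Every row shows a positive accuracy drop and a positive increase in standard deviation.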

t-SNE

t-SNE plots of the hidden features obtained with trained models on the Cora dataset. Points represent graph nodes, whereas different colors represent different class labels.

[Figure: four t-SNE plots, one per model: Planetoid-T, ChebNet, GCN, GAT]

Reproducing experiments

Download repository

$ git clone https://github.com/fedem96/NeuralNetworksOnGraphs.git
$ cd NeuralNetworksOnGraphs

Install dependencies

$ pip install -r requirements.txt

Download (original) datasets

$ mkdir data

Citeseer

$ wget https://linqs-data.soe.ucsc.edu/public/lbc/citeseer.tgz
$ tar -xf citeseer.tgz -C data

Cora

$ wget https://linqs-data.soe.ucsc.edu/public/lbc/cora.tgz
$ tar -xf cora.tgz -C data

Pubmed

$ wget https://linqs-data.soe.ucsc.edu/public/Pubmed-Diabetes.tgz
$ tar -xf Pubmed-Diabetes.tgz -C data
$ mv data/Pubmed-Diabetes data/pubmed
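
After running the commands above, a quick check that the dataset folders are in place can catch a failed download early. The sketch below is not part of the repository. It assumes each archive unpacks into a folder named after the dataset (`data/citeseer`, `data/cora`, and, after the rename, `data/pubmed`). The function name `missing_datasets` is hypothetical.

```python
from pathlib import Path

# Folder names assumed from the commands above: each archive is expected to
# unpack into a folder named after the dataset, with Pubmed renamed afterwards.
EXPECTED_DIRS = ("citeseer", "cora", "pubmed")


def missing_datasets(data_root="data"):
    """Return the expected dataset folders that are absent or empty."""
    root = Path(data_root)
    missing = []
    for name in EXPECTED_DIRS:
        folder = root / name
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_datasets()
    if missing:
        print("Missing or empty dataset folders:", ", ".join(missing))
    else:
        print("All dataset folders found.")
```

Run it from the repository root, so that the relative path `data` resolves to the folder created earlier.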

Run the experiments

To reproduce the experiments, please refer to this.

References

[1] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang and P. S. Yu. A Comprehensive Survey on Graph Neural Networks (2019)
[2] P. Frasconi, M. Gori and A. Sperduti. A general framework for adaptive processing of data structures (1998)
[3] M. Gori, G. Monfardini and F. Scarselli. A new model for learning in graph domains (2005)
[4] Z. Yang, W. Cohen and R. Salakhutdinov. Revisiting semi-supervised learning with graph embeddings (2016)
[5] M. Defferrard, X. Bresson and P. Vandergheynst. Convolutional neural networks on graphs with fast localized spectral filtering (2016)
[6] T. N. Kipf and M. Welling. Semi-supervised classification with graph convolutional networks (2017)
[7] P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Lio and Y. Bengio. Graph attention networks (2017)
[8] Datasets: https://linqs.soe.ucsc.edu/data
[9] C. L. Giles, K. Bollacker and S. Lawrence. Citeseer: An automatic citation indexing system (1998)
[10] A. McCallum, K. Nigam, J. Rennie, and K. Seymore. Automating the construction of internet portals with machine learning (2000)
