This is a repository for an ICML 2023 SPIGM Workshop paper and my Master's Thesis at Skoltech.
Authors: Dmitrii Gavrilev, Evgeny Burnaev (research advisor)
In this project, we use GDSS as a generative model.
Node outlier detection in attributed graphs is a challenging problem: no single method performs well across different datasets. Motivated by the state-of-the-art results of score-based models in graph generative modeling, we propose to incorporate them into this problem. Our method achieves competitive results on small-scale graphs. We also provide an empirical analysis of the Dirichlet energy and show that generative models may struggle to reconstruct it accurately.
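For reference, the Dirichlet energy of node features on a graph measures how much the features vary across edges. The sketch below is an illustrative NumPy implementation of the standard definition (it is not the code used in this repository): $E(X) = \frac{1}{2}\sum_{i,j} A_{ij}\,\lVert x_i - x_j\rVert^2 = \mathrm{tr}(X^\top L X)$, with $L = D - A$ the unnormalized graph Laplacian.

```python
import numpy as np

def dirichlet_energy(A, X):
    """Dirichlet energy of node features X on a graph with adjacency matrix A.

    Computes trace(X^T L X), where L = D - A is the unnormalized Laplacian.
    Equivalent to 0.5 * sum_ij A_ij * ||x_i - x_j||^2 for a symmetric A.
    """
    D = np.diag(A.sum(axis=1))  # degree matrix
    L = D - A                   # unnormalized graph Laplacian
    return float(np.trace(X.T @ L @ X))

# Example: a path graph on 3 nodes with scalar node features
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
X = np.array([[0.0], [1.0], [3.0]])
print(dirichlet_energy(A, X))  # (1-0)^2 + (3-1)^2 = 5.0
```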
`python run_benchmark.py` trains GDSS with random hyperparameters on a chosen dataset, runs inference with our methods, and repeats this pipeline 20 times. Inference produces a `.npy` file with intermediate calculations.
Arguments:
- `--config` (path to a dataset config)
- `--exp_name` (the name of the experiment/checkpoints)
- `--radius` (the number of hops in ego-graphs)
- `--trajectory_sample` (the number of samples per trajectory; $K$ in the paper)
- `--num_sample` (the number of samples per node; $S$ in the paper)
- `--num_steps` (the number of denoising steps over the full time horizon $[0, 1]$)
- `--is_energy` (if True, use the shift in energy as the graph dissimilarity measure)
- `--skip_training` (inference-only mode; assumes the checkpoints already exist)
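A full invocation might look like the following. The config path and argument values here are placeholders for illustration; check the configs shipped with the repository for the actual dataset names and defaults.

```shell
# Hypothetical example run; adjust paths and values to your setup
python run_benchmark.py \
  --config config/your_dataset.yaml \
  --exp_name my_experiment \
  --radius 2 \
  --trajectory_sample 8 \
  --num_sample 4 \
  --num_steps 100 \
  --is_energy True
```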
We evaluate our methods in a notebook by processing the intermediate calculations from the `.npy` files. See an example of training, inference, and evaluation in Colab:
- (Matrix distance as a dissimilarity measure)
- (Shift in energy as a dissimilarity measure)
Optionally, you can download the model checkpoints here. Unzip them into `./checkpoint/{dataset_name}/` and run the benchmark with the `--skip_training True` option.