The ogbg-molhiv and ogbg-molpcba datasets are two molecular property prediction datasets of different sizes: ogbg-molhiv (small) and ogbg-molpcba (medium). The task is to predict the target molecular properties as accurately as possible, where the properties are cast as binary labels, e.g., whether a molecule inhibits HIV replication or not. Note that some datasets (e.g., ogbg-molpcba) contain multiple tasks, and their labels can include NaN values indicating that the corresponding label is not assigned to the molecule. The challenge leaderboard is available at https://ogb.stanford.edu/docs/leader_graphprop/. We apply Heterogeneous Interpolation (HIG) to this challenge, and this repo contains our code submission. The technical report is available under ./report/.
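For reference, the snippet below is a minimal sketch (not part of this repo) of how the official `ogb` package exposes these labels and metrics, assuming the `ogb` and `pytorch_geometric` versions listed in the requirements; the variable names are only illustrative.

```python
import torch
from ogb.graphproppred import PygGraphPropPredDataset, Evaluator

# Download/load the dataset and its scaffold split.
dataset = PygGraphPropPredDataset(name="ogbg-molhiv")   # or "ogbg-molpcba"
split_idx = dataset.get_idx_split()                     # "train" / "valid" / "test" indices

# Each graph stores its binary label(s) in data.y; ogbg-molpcba has multiple
# tasks per molecule, and unassigned labels are stored as NaN.
data = dataset[0]
print(data.y.shape)             # [1, 1] for ogbg-molhiv
is_labeled = data.y == data.y   # NaN != NaN, so this masks unassigned labels

# The official evaluator reports ROC-AUC for ogbg-molhiv and AP for ogbg-molpcba.
evaluator = Evaluator(name="ogbg-molhiv")
result = evaluator.eval({
    "y_true": torch.tensor([[0.0], [1.0], [0.0], [1.0]]),
    "y_pred": torch.rand(4, 1),
})
print(result)                   # {'rocauc': ...}
```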
Install base packages:
```bash
Python>=3.7
Pytorch>=1.9.0
tensorflow>=2.0.0
pytorch_geometric>=1.6.0
ogb>=1.3.2
dgl>=0.5.3
numpy==1.20.3
pandas==1.2.5
scikit-learn==0.24.2
deep_gcns_torch
LibAUC
```
We run the default code 10 times and report our results on the ogbg-molhiv and ogbg-molpcba datasets below. For ogbg-molhiv, since the dataset is relatively small, we use DeepGCN as our backbone.
| Dataset | Method | Test AUROC | Validation AUROC | Parameters | Hardware |
|---|---|---|---|---|---|
| ogbg-molhiv | DeepGCN+HIG | 0.8403±0.0021 | 0.8176±0.0034 | 1019408 | Tesla V100 (32GB) |
For ogbg-molpcba, we use Graphormer as our backbone.
| Dataset | Method | Test AP | Validation AP | Parameters | Hardware |
|---|---|---|---|---|---|
| ogbg-molpcba | Graphormer+HIG | 0.3167±0.0034 | 0.3252±0.0043 | 119529665 | Tesla V100 (32GB) |
## Training Process for ogbg-molhiv
The training process has three steps:
- Preparation: Extract fingerprints and train a Random Forest by following PaddleHelix (an illustrative sketch follows the commands below).
```bash
python extract_fingerprint.py
python random_forest.py
```
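We do not reproduce these scripts here; the following is only a rough sketch of the kind of fingerprint-plus-random-forest stage they implement, assuming RDKit (not listed in the requirements above) and scikit-learn, with purely illustrative function and variable names.

```python
# Illustrative only: the actual extract_fingerprint.py / random_forest.py may differ.
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestClassifier

def morgan_fingerprints(smiles_list, radius=2, n_bits=2048):
    """Convert SMILES strings into Morgan (ECFP-like) bit vectors."""
    fps = []
    for smiles in smiles_list:
        mol = Chem.MolFromSmiles(smiles)
        fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)
        fps.append(np.array(list(fp), dtype=np.int8))
    return np.stack(fps)

# Hypothetical usage: train on labeled molecules and keep the predicted
# probabilities as an extra "fingerprint model" signal for the GNN stage.
train_smiles = ["CCO", "c1ccccc1", "CC(=O)O", "CCN"]
train_labels = np.array([0, 1, 0, 1])

X = morgan_fingerprints(train_smiles)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, train_labels)
probs = clf.predict_proba(X)[:, 1]   # probability of the positive class
```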
- Pretrain: Jointly train a DeepGCN model with the fingerprint model.
```bash
python main.py --use_gpu --conv_encode_edge --num_layers 14 --block res+ --gcn_aggr softmax --t 1.0 --learn_t --dropout 0.2 \
    --dataset ogbg-molhiv \
    --loss auroc \
    --optimizer pesg \
    --batch_size 512 \
    --lr 0.1 \
    --gamma 500 \
    --margin 1.0 \
    --weight_decay 1e-5 \
    --random_seed 0 \
    --epochs 300

python finetune.py --use_gpu --conv_encode_edge --num_layers 14 --block res+ --gcn_aggr softmax --t 1.0 --learn_t --dropout 0.2 \
    --dataset ogbg-molhiv \
    --loss auroc \
    --optimizer pesg \
    --batch_size 512 \
    --lr 0.01 \
    --gamma 300 \
    --margin 1.0 \
    --weight_decay 1e-5 \
    --random_seed 0 \
    --epochs 100
```
- Finetune: Finetune the pretrained model obtained in Step 2 using NodeDrop augmentation (an illustrative sketch of the augmentation follows the commands below). We provide a pretrained model in ./saved_models/FT.
```bash
python finetune_dropnode.py --use_gpu --conv_encode_edge --num_layers 14 --block res+ --gcn_aggr softmax --t 1.0 --learn_t --dropout 0 \
    --dataset ogbg-molhiv \
    --loss auroc \
    --optimizer pesg \
    --batch_size 512 \
    --lr 0.01 \
    --gamma 300 \
    --margin 1.0 \
    --weight_decay 1e-5 \
    --random_seed 0 \
    --epochs 100
```
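This README does not spell out the NodeDrop augmentation itself; the sketch below shows one common way to drop nodes from a PyTorch Geometric graph (keeping a random subset of nodes and relabeling the induced subgraph). It is an assumption for illustration, not necessarily what finetune_dropnode.py does.

```python
# Hedged sketch of a node-dropping augmentation on a torch_geometric Data object.
import torch
from torch_geometric.data import Data
from torch_geometric.utils import subgraph

def drop_nodes(data: Data, drop_prob: float = 0.1) -> Data:
    num_nodes = data.num_nodes
    keep_mask = torch.rand(num_nodes) >= drop_prob
    if not keep_mask.any():                     # never drop every node
        keep_mask[torch.randint(num_nodes, (1,))] = True

    # Keep only edges between surviving nodes and relabel them to 0..k-1.
    edge_index, edge_attr = subgraph(keep_mask, data.edge_index, data.edge_attr,
                                     relabel_nodes=True, num_nodes=num_nodes)
    return Data(x=data.x[keep_mask], edge_index=edge_index,
                edge_attr=edge_attr, y=data.y)

# Example: a 4-node path graph with one-hot node features.
g = Data(x=torch.eye(4),
         edge_index=torch.tensor([[0, 1, 2], [1, 2, 3]]),
         y=torch.tensor([[1.0]]))
print(drop_nodes(g, drop_prob=0.5))
```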
We provide a sample log file: log_ft_molhiv_20211202.txt.
## Training Process for ogbg-molpcba
The training process is as follows. First, change into the Graphormer directory:
```bash
cd ./Graphormer_with_HIG
```
- Pretrain: Pretrain a Graphormer model on the PCQM4M dataset.
```bash
sh ./examples/ogb-lsc/lsc-pcba.sh
```
- Finetune: Finetune the pretrained model obtained in Step 1 using NodeDrop augmentation and a KL-divergence constraint (an illustrative sketch of the KL term follows the commands below). We provide a pretrained model in ./exps/tmp.
```bash
sh ./examples/ogb/finetune_dropnode.sh
sh ./examples/ogb/finetune_kl.sh
```
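Purely as an illustration of what a KL-divergence consistency constraint between the clean-graph and node-dropped predictions can look like for multi-task binary outputs, the sketch below treats each task head as a Bernoulli distribution; the names, weighting, and stop-gradient choice are assumptions rather than the exact implementation in finetune_kl.sh.

```python
# Hedged sketch of a per-task Bernoulli KL term between predictions on the
# original graphs and on their node-dropped counterparts.
import torch

def bernoulli_kl(p_logits: torch.Tensor, q_logits: torch.Tensor, eps: float = 1e-6):
    """KL(p || q) averaged over graphs and tasks, with both inputs given as logits."""
    p = torch.sigmoid(p_logits).clamp(eps, 1 - eps)
    q = torch.sigmoid(q_logits).clamp(eps, 1 - eps)
    kl = p * (p / q).log() + (1 - p) * ((1 - p) / (1 - q)).log()
    return kl.mean()

logits_clean = torch.randn(8, 128)   # predictions on the original graphs
logits_aug = torch.randn(8, 128)     # predictions on the node-dropped graphs

# Stop gradients through the clean branch so it acts as the "teacher".
consistency = bernoulli_kl(logits_clean.detach(), logits_aug)
# total_loss = supervised_loss + lambda_kl * consistency   (lambda_kl is a tuning knob)
```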
## Reference
- https://libauc.org/
- https://github.com/Optimization-AI/LibAUC
- https://github.com/PaddlePaddle/PaddleHelix/tree/dev/competition/ogbg_molhiv
- https://github.com/lightaime/deep_gcns_torch/
- https://github.com/yzhuoning/DeepAUC_OGB_Challenge
- https://github.com/microsoft/Graphormer
- https://ogb.stanford.edu/