This work introduces a randomized topological augmentor based on Schur complements for Graph Contrastive Learning (GCL). The rLap augmentor is written in C++ (with Python bindings) and uses Eigen to represent sparse matrices, which allows efficient traversal of and indexing into them. The data structures for sampling edges are inspired by the Laplacians.jl project.
Generalized GCL framework. The augmentor is effective for GCL with varying design choices of encoders and objectives.
The motivation and methodology behind rLap are presented in my ICML 2023 paper:
@inproceedings{Kothapalli2023RandomizedSC,
title={Randomized Schur Complement Views for Graph Contrastive Learning},
author={Vignesh Kothapalli},
booktitle={International Conference on Machine Learning},
year={2023}
}
# create virtual environment
$ python3.9 -m virtualenv .venv
$ source .venv/bin/activate
# install torch and torch-geometric if not present
$ pip install torch torch-geometric
# install rlap
$ pip install .
The rlap API exposes a simple torch operation to obtain the randomized Schur complement of a graph. A simple example is shown below:
import torch
from torch_geometric.utils import barabasi_albert_graph, to_undirected
import rlap
num_nodes = 100
# prepare a sample graph
edge_index = barabasi_albert_graph(num_nodes=num_nodes, num_edges=num_nodes//2)
# ensure the graph is undirected
edge_index = to_undirected(edge_index=edge_index, num_nodes=num_nodes)
# compute the randomized schur complement
sc_edge_info = rlap.ops.approximate_cholesky(
    edge_index=edge_index,
    edge_weights=None,  # pass the 1d weights tensor (for the edges) if needed
    num_nodes=num_nodes,
    num_remove=50,  # number of nodes to eliminate
    o_v="random",  # choose from ["random", "degree", "coarsen"]
    o_n="asc",  # choose from ["asc", "desc", "random"]
)
# obtain the edge_index
sc_edge_index = (torch.Tensor(sc_edge_info[:, :2]).long().t().to(edge_index.device))
# obtain the edge_weights (if necessary)
sc_edge_weights = torch.Tensor(sc_edge_info[:,-1]).t().to(edge_index.device)
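For reference, the exact (non-randomized) Schur complement of a graph Laplacian can be computed by eliminating nodes one at a time: removing a node connects each pair of its neighbors with an edge of weight equal to the product of the two incident edge weights divided by the node's weighted degree. The pure-Python sketch below illustrates only this exact elimination and is not the rlap implementation; the function name `eliminate_nodes` and the `(u, v, weight)` edge-list format are assumptions made for the example. As far as the description above suggests, the randomized variant samples edges instead of materializing every clique, which keeps the output sparse.

```python
from collections import defaultdict
from itertools import combinations


def eliminate_nodes(edges, nodes_to_remove):
    """Exact Schur complement of a weighted undirected graph (illustrative).

    edges: iterable of (u, v, weight) with u != v (no self loops assumed).
    nodes_to_remove: nodes to eliminate, processed in the given order.
    Returns {(u, v): weight} with u < v over the remaining nodes.
    """
    # Symmetric adjacency map; parallel edges are merged by summing weights.
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w

    for x in nodes_to_remove:
        neighbors = adj.pop(x, {})
        degree = sum(neighbors.values())
        # Detach x from each of its neighbors.
        for n in neighbors:
            del adj[n][x]
        # Eliminating x adds a clique over its neighbors; edge (a, b)
        # receives weight w_xa * w_xb / deg(x).
        for a, b in combinations(neighbors, 2):
            w = neighbors[a] * neighbors[b] / degree
            adj[a][b] = adj[a].get(b, 0.0) + w
            adj[b][a] = adj[b].get(a, 0.0) + w

    return {
        (u, v): w
        for u, nbrs in adj.items()
        for v, w in nbrs.items()
        if u < v
    }


# Path graph 0 - 1 - 2 with unit weights: eliminating the middle node
# leaves a single edge whose weight is 1 * 1 / 2.
print(eliminate_nodes([(0, 1, 1.0), (1, 2, 1.0)], [1]))  # {(0, 2): 0.5}
```

Eliminating a high-degree node this way creates a dense clique, which is the cost a sampling-based approximation is meant to avoid.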
The PyTorch Geometric implementation of the augmentor is based on the PyGCL library for reproducible experiments and is available in augmentor_benchmarks.py. Additionally, a DGL implementation is available in CCA-SSG/aug.py.
To run the following scripts, change the directory to:
$ cd scripts
Use the following shell script to benchmark all the augmentors on node and graph classification datasets
$ bash run_augmentor_benchmarks.sh
Use the following Python script to prepare the LaTeX table of benchmark results. The table is filled in correctly only after both the CPU and GPU benchmarks have completed; generating it after interrupting the previous script leads to parsing errors for the incomplete runs.
$ python prepare_augmentor_stats.py
Use the following shell script to run node classification experiments using the GRACE design
$ bash run_node_shared.sh
Use the following shell script to run node classification experiments using the MVGRL design
$ bash run_node_dedicated.sh
Use the following shell script to run graph classification experiments using the GraphCL design
$ bash run_graph_shared.sh
Use the following shell script to run graph classification experiments using the BGRL (g-l) design
$ bash run_graph_shared_g2l.sh
Use the following Python script to prepare the LaTeX table of results
$ python prepare_final_stats.py
Use the following Python script to run the maximum singular value and edge count analysis of rLap variants
$ python rlap_vc_spectral.py
Use the following Python script to plot the edge counts of randomized Schur complements after diffusion
$ python rlap_ppr_edge_plots.py
Please feel free to open issues and create pull requests to fix bugs and improve performance.