dig.sslgraph.evaluation

Evaluation interfaces under dig.sslgraph.evaluation.

GraphSemisupervised

The evaluation interface for semi-supervised learning and transfer learning for graph-level tasks with pretraining and finetuning datasets.

GraphUnsupervised

The evaluation interface for unsupervised graph representation learning evaluated with linear classification.

NodeUnsupervised

The evaluation interface for unsupervised node representation learning evaluated with linear classification.

class GraphSemisupervised(dataset, dataset_pretrain, label_rate=1, loss=<function nll_loss>, epoch_select='test_max', metric='acc', n_folds=10, device=None, **kwargs)[source]

The evaluation interface for semi-supervised learning and transfer learning for graph-level tasks with pretraining and finetuning datasets. You can refer to the benchmark code for examples of usage.

Parameters
  • dataset (torch_geometric.data.Dataset) – The graph dataset for finetuning and evaluation.

  • dataset_pretrain (torch_geometric.data.Dataset) – The graph dataset for pretraining.

  • label_rate (float, optional) – Ratio of labels to use in the finetuning dataset. (default: 1)

  • loss (callable, optional) – Loss function used in finetuning and evaluation. (default: torch.nn.functional.nll_loss)

  • epoch_select (string, optional) – "test_max" or "val_max". (default: "test_max")

  • metric (string, optional) – Evaluation metric. (default: "acc")

  • n_folds (int, optional) – Number of folds for evaluation. (default: 10)

  • device (int, or torch.device, optional) – Device for computation. (default: None)

  • **kwargs (optional) – Training and evaluation configs in setup_train_config().

Examples

>>> dataset, pretrain_dataset = get_dataset("NCI1", "semisupervised")
>>> evaluator = GraphSemisupervised(dataset, pretrain_dataset, device=0)
>>> evaluator.evaluate(model, encoder) # semi-supervised learning
>>> dataset = MoleculeNet("./transfer_data", "HIV")
>>> pretrain_dataset = ZINC("./transfer_data")
>>> evaluator = GraphSemisupervised(dataset, pretrain_dataset, device=0)
>>> evaluator.evaluate(model, encoder) # transfer learning for molecule classification

Note

When using a torch_geometric.data.Dataset without our provided get_dataset function, you may need to manually add self-loops before passing the dataset to the evaluator if a view function requires them, such as diffusion.
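For example, if you load a dataset yourself, self-loops can be added with torch_geometric.transforms.AddSelfLoops (a sketch; the dataset name and root path below are placeholders):

>>> from torch_geometric.datasets import TUDataset
>>> from torch_geometric.transforms import AddSelfLoops
>>> # pre_transform is applied once, before the processed dataset is cached;
>>> # remove previously processed files if you change it.
>>> dataset = TUDataset("./data", name="NCI1", pre_transform=AddSelfLoops())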

evaluate(learning_model, encoder, pred_head=None, fold_seed=12345)[source]

Run evaluation with the given learning model and encoder(s).

Parameters
  • learning_model – An instance of a contrastive model (sslgraph.method.Contrastive) or a predictive model.

  • encoder (torch.nn.Module, or list) – A trainable PyTorch model, or a list of such models.

  • pred_head (torch.nn.Module, optional) – Prediction head. If None, will use linear projection. (default: None)

Return type

(float, float)
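A sketch of unpacking the result, assuming the two floats are the mean and standard deviation of the test metric over the n_folds folds (an interpretation of the return type above, not stated explicitly by the API):

>>> acc_mean, acc_std = evaluator.evaluate(model, encoder)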

grid_search(learning_model, encoder, pred_head=None, ...)[source]

Perform grid search on the pretraining learning rate and number of epochs.

Parameters
  • learning_model – An instance of a contrastive model (sslgraph.method.Contrastive) or a predictive model.

  • encoder (torch.nn.Module, or list) – A trainable PyTorch model, or a list of such models.

  • pred_head (torch.nn.Module, optional) – Prediction head. If None, will use linear projection. (default: None)

  • p_lr_lst (list, optional) – List of candidate pretraining learning rates.

  • p_epoch_lst (list, optional) – List of candidate numbers of pretraining epochs.

Return type

(float, float, (float, int))
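A hedged usage sketch, assuming the returned triple holds the best mean metric, its standard deviation, and the selected (learning rate, epochs) pair; the candidate lists below are illustrative:

>>> acc_mean, acc_std, (best_lr, best_epoch) = evaluator.grid_search(
...     model, encoder, p_lr_lst=[0.01, 0.001, 0.0001], p_epoch_lst=[20, 60, 100])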

setup_train_config(batch_size=128, p_optim='Adam', p_lr=0.0001, p_weight_decay=0, p_epoch=100, f_optim='Adam', f_lr=0.001, f_weight_decay=0, f_epoch=100)[source]

Set up the training configuration.

Parameters
  • batch_size (int, optional) – Batch size for pretraining and inference. (default: 128)

  • p_optim (string, or torch.optim.Optimizer class) – Optimizer for pretraining. (default: "Adam")

  • p_lr (float, optional) – Pretraining learning rate. (default: 0.0001)

  • p_weight_decay (float, optional) – Pretraining weight decay rate. (default: 0)

  • p_epoch (int, optional) – Number of pretraining epochs. (default: 100)

  • f_optim (string, or torch.optim.Optimizer class) – Optimizer for finetuning. (default: "Adam")

  • f_lr (float, optional) – Finetuning learning rate. (default: 0.001)

  • f_weight_decay (float, optional) – Finetuning weight decay rate. (default: 0)

  • f_epoch (int, optional) – Number of finetuning epochs. (default: 100)
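For example, to shorten pretraining and lower the finetuning learning rate (the values below are illustrative):

>>> evaluator.setup_train_config(batch_size=64, p_lr=0.001, p_epoch=60, f_lr=0.0005)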

class GraphUnsupervised(dataset, classifier='SVC', log_interval=1, epoch_select='test_max', metric='acc', n_folds=10, device=None, **kwargs)[source]

The evaluation interface for unsupervised graph representation learning evaluated with linear classification. You can refer to the benchmark code for examples of usage.

Parameters
  • dataset (torch_geometric.data.Dataset) – The graph classification dataset.

  • classifier (string, optional) – Linear classifier for evaluation, "SVC" or "LogReg". (default: "SVC")

  • log_interval (int, optional) – Evaluate every log_interval pretraining epochs. (default: 1)

  • epoch_select (string, optional) – "test_max" or "val_max". (default: "test_max")

  • metric (string, optional) – Evaluation metric. (default: "acc")

  • n_folds (int, optional) – Number of folds for evaluation. (default: 10)

  • device (int, or torch.device, optional) – Device for computation. (default: None)

  • **kwargs (optional) – Training and evaluation configs in setup_train_config().

Examples

>>> encoder = Encoder(...)
>>> model = Contrastive(...)
>>> evaluator = GraphUnsupervised(dataset, log_interval=10, device=0, p_lr=0.001)
>>> evaluator.evaluate(model, encoder)

evaluate(learning_model, encoder, fold_seed=None)[source]

Run evaluation with the given learning model and encoder(s).

Parameters
  • learning_model – An instance of a contrastive model (sslgraph.method.Contrastive) or a predictive model.

  • encoder (torch.nn.Module, or list) – A trainable PyTorch model, or a list of such models.

  • fold_seed (int, optional) – Seed for fold split. (default: None)

Return type

(float, float)

grid_search(learning_model, encoder, ...)[source]

Perform grid search on the pretraining learning rate and number of epochs.

Parameters
  • learning_model – An instance of a contrastive model (sslgraph.method.Contrastive) or a predictive model.

  • encoder (torch.nn.Module, or list) – A trainable PyTorch model, or a list of such models.

  • p_lr_lst (list, optional) – List of candidate pretraining learning rates.

  • p_epoch_lst (list, optional) – List of candidate numbers of pretraining epochs.

Return type

(float, float, (float, int))

setup_train_config(batch_size=256, p_optim='Adam', p_lr=0.01, p_weight_decay=0, p_epoch=20, svc_search=True)[source]

Set up the training configuration.

Parameters
  • batch_size (int, optional) – Batch size for pretraining and inference. (default: 256)

  • p_optim (string, or torch.optim.Optimizer class) – Optimizer for pretraining. (default: "Adam")

  • p_lr (float, optional) – Pretraining learning rate. (default: 0.01)

  • p_weight_decay (float, optional) – Pretraining weight decay rate. (default: 0)

  • p_epoch (int, optional) – Number of pretraining epochs. (default: 20)

  • svc_search (bool, optional) – If True, search for the hyper-parameter C in SVC. (default: True)
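For example, to disable the SVC hyper-parameter search and shorten pretraining (the values below are illustrative):

>>> evaluator.setup_train_config(batch_size=128, p_lr=0.001, p_epoch=10, svc_search=False)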

class NodeUnsupervised(full_dataset, train_mask=None, val_mask=None, test_mask=None, classifier='LogReg', metric='acc', device=None, log_interval=1, **kwargs)[source]

The evaluation interface for unsupervised node representation learning evaluated with linear classification. You can refer to the benchmark code for examples of usage.

Parameters
  • full_dataset (torch_geometric.data.Dataset) – The dataset containing the full graph(s) for node classification.

  • train_mask (Tensor, optional) – Boolean tensor of shape [n_nodes,], indicating nodes for training. Set to None if included in dataset. (default: None)

  • val_mask (Tensor, optional) – Boolean tensor of shape [n_nodes,], indicating nodes for validation. Set to None if included in dataset. (default: None)

  • test_mask (Tensor, optional) – Boolean tensor of shape [n_nodes,], indicating nodes for test. Set to None if included in dataset. (default: None)

  • classifier (string, optional) – Linear classifier for evaluation, "SVC" or "LogReg". (default: "LogReg")

  • metric (string, optional) – Evaluation metric. (default: "acc")

  • log_interval (int, optional) – Evaluate every log_interval pretraining epochs. (default: 1)

  • device (int, or torch.device, optional) – Device for computation. (default: None)

  • **kwargs (optional) – Training and evaluation configs in setup_train_config().

Examples

>>> node_dataset = get_node_dataset("Cora") # using default train/test split
>>> evaluator = NodeUnsupervised(node_dataset, log_interval=10, device=0)
>>> evaluator.evaluate(model, encoder)
>>> # Using your own dataset or a different train/val/test split
>>> node_dataset = SomeDataset()
>>> train_mask = torch.tensor([...], dtype=torch.bool)
>>> val_mask = torch.tensor([...], dtype=torch.bool)
>>> test_mask = torch.tensor([...], dtype=torch.bool)
>>> evaluator = NodeUnsupervised(node_dataset, train_mask, val_mask, test_mask, log_interval=10, device=0)
>>> evaluator.evaluate(model, encoder)

evaluate(learning_model, encoder)[source]

Run evaluation with the given learning model and encoder(s).

Parameters
  • learning_model – An instance of a contrastive model (sslgraph.method.Contrastive) or a predictive model.

  • encoder (torch.nn.Module, or list) – A trainable PyTorch model, or a list of such models.

Return type

(float, float)

evaluate_multisplits(learning_model, encoder, split_masks)[source]

Run evaluation with the given learning model and encoder(s), returning scores averaged over multiple different splits.

Parameters
  • learning_model – An instance of a contrastive model (sslgraph.method.Contrastive) or a predictive model.

  • encoder (torch.nn.Module, or list) – A trainable PyTorch model, or a list of such models.

  • split_masks (list, or generator) – A list or generator that contains or yields (train_mask, val_mask, test_mask) tuples, one per split.

Return type

float

Example

>>> split_masks = [(train1, val1, test1), (train2, val2, test2), ..., (train20, val20, test20)]
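A minimal sketch of generating 20 random 60/20/20 splits as a generator and averaging over them; the helper and split ratios are illustrative, and node_dataset, evaluator, model and encoder are as in the example above:

>>> import torch
>>> n_nodes = node_dataset[0].num_nodes
>>> def random_masks(seed):
...     # One random permutation per split; first 60% train, next 20% val, rest test.
...     perm = torch.randperm(n_nodes, generator=torch.Generator().manual_seed(seed))
...     n_tr, n_va = int(0.6 * n_nodes), int(0.2 * n_nodes)
...     masks = torch.zeros(3, n_nodes, dtype=torch.bool)
...     masks[0][perm[:n_tr]] = True
...     masks[1][perm[n_tr:n_tr + n_va]] = True
...     masks[2][perm[n_tr + n_va:]] = True
...     return tuple(masks)
>>> split_masks = (random_masks(seed) for seed in range(20))
>>> mean_acc = evaluator.evaluate_multisplits(model, encoder, split_masks)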

grid_search(learning_model, encoder, ...)[source]

Perform grid search on the pretraining learning rate and number of epochs.

Parameters
  • learning_model – An instance of a contrastive model (sslgraph.method.Contrastive) or a predictive model.

  • encoder (torch.nn.Module, or list) – A trainable PyTorch model, or a list of such models.

  • p_lr_lst (list, optional) – List of candidate pretraining learning rates.

  • p_epoch_lst (list, optional) – List of candidate numbers of pretraining epochs.

Return type

(float, float, (float, int))