dig.xgraph.dataset
Dataset interfaces under dig.xgraph.dataset.
BA_LRP | The synthetic graph classification dataset used in Higher-Order Explanations of Graph Neural Networks via Relevant Walks. |
MoleculeDataset | The extension of MoleculeNet with MUTAG. |
SentiGraphDataset | The SentiGraph datasets from Explainability in Graph Neural Networks: A Taxonomic Survey. |
SynGraphDataset | The Synthetic datasets used in Parameterized Explainer for Graph Neural Network. |
- class BA_LRP(root, num_per_class=10000, transform=None, pre_transform=None)
The synthetic graph classification dataset used in Higher-Order Explanations of Graph Neural Networks via Relevant Walks.
The first class in BA_LRP consists of Barabási–Albert (BA) graphs, which are grown by connecting each new node to a node \(\mathcal{V}\) of the current graph \(\mathcal{G}\), chosen with probability proportional to its degree:
\[p(\mathcal{V}) = \frac{Degree(\mathcal{V})}{\sum_{\mathcal{V}' \in \mathcal{G}} Degree(\mathcal{V}')}\]
The second class in BA_LRP uses a slightly higher growth model, and nodes are selected without replacement using the inverse preferential attachment model:
\[p(\mathcal{V}) = \frac{Degree(\mathcal{V})^{-1}}{\sum_{\mathcal{V}' \in \mathcal{G}} Degree(\mathcal{V}')^{-1}}\]
- Parameters
root (str) – Root data directory in which to save the datasets.
num_per_class (int) – The number of graphs for each class.
transform (Callable, None) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (Callable, None) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
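The two attachment rules in the BA_LRP class description can be illustrated with a minimal pure-Python sketch. It is not the generator used by BA_LRP: the helper names `attachment_probs` and `grow` are hypothetical, and the sketch assumes one new edge per added node, so it does not reproduce the higher growth or the without-replacement selection of the second class.

```python
import random


def attachment_probs(degrees, inverse=False):
    """Return p(v) for every node v, given a {node: degree} mapping.

    inverse=False: p(v) proportional to Degree(v)      (preferential attachment)
    inverse=True:  p(v) proportional to Degree(v)^-1   (inverse preferential attachment)
    """
    weights = {v: (1.0 / d if inverse else float(d)) for v, d in degrees.items()}
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}


def grow(num_nodes, inverse=False, seed=0):
    """Grow a graph node by node (assumption: one new edge per added node)."""
    rng = random.Random(seed)
    edges = [(0, 1)]          # start from a single edge
    degrees = {0: 1, 1: 1}
    for new_node in range(2, num_nodes):
        probs = attachment_probs(degrees, inverse)
        nodes = list(probs)
        # Sample the existing node that the new node connects to.
        target = rng.choices(nodes, weights=[probs[v] for v in nodes])[0]
        edges.append((target, new_node))
        degrees[target] += 1
        degrees[new_node] = 1
    return edges, degrees


# A hub of degree 3 among three degree-1 nodes.
probs = attachment_probs({0: 3, 1: 1, 2: 1, 3: 1})
inv_probs = attachment_probs({0: 3, 1: 1, 2: 1, 3: 1}, inverse=True)
```

Under the first rule the hub is the most likely target (probability 3/6), while under the inverse rule it is the least likely (probability 1/10), which is what separates the two classes structurally.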
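Note
BA_LRP will automatically generate the dataset if the dataset file does not exist in the root directory.
Example
>>> from dig.xgraph.dataset import BA_LRP
>>> from torch_geometric.loader import DataLoader  # PyG >= 2.0; older releases: torch_geometric.data.DataLoader
>>> dataset = BA_LRP(root='./datasets')
>>> loader = DataLoader(dataset, batch_size=32)
>>> data = next(iter(loader))
>>> # Batch(batch=[640], edge_index=[2, 1344], x=[640, 1], y=[32, 1])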
where the attributes of data are:
batch: The assignment vector mapping each node to its graph index
x: The node features
edge_index: The edge matrix
y: The graph label
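To make the role of the batch attribute concrete, here is a small sketch of how such an assignment vector can be built from per-graph node counts. The helper `build_batch_vector` is hypothetical, and the sketch assumes every graph has 20 nodes, which is consistent with the 640 nodes across 32 graphs in the example output but is not stated by the documentation.

```python
def build_batch_vector(nodes_per_graph):
    """Map every node of a mini-batch to the index of the graph it belongs to."""
    batch = []
    for graph_idx, num_nodes in enumerate(nodes_per_graph):
        # All nodes of graph `graph_idx` receive the same entry.
        batch.extend([graph_idx] * num_nodes)
    return batch


# Assumption: 32 graphs with 20 nodes each, giving 640 nodes in total.
batch = build_batch_vector([20] * 32)
```

Nodes 0 to 19 map to graph 0, nodes 20 to 39 map to graph 1, and so on, so a graph-level readout can pool the rows of x that share a batch entry.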
- property processed_file_names
The names of the files in the self.processed_dir folder that must be present in order to skip processing.
- property raw_file_names
The names of the files in the self.raw_dir folder that must be present in order to skip downloading.
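The skip logic described by these two properties amounts to a file-existence check. The sketch below illustrates the idea with the standard library only; `all_present` and the file name `data.pt` are hypothetical and are not taken from the dataset classes.

```python
import tempfile
from pathlib import Path


def all_present(directory, file_names):
    """Return True only if every expected file exists in `directory`."""
    return all((Path(directory) / name).exists() for name in file_names)


with tempfile.TemporaryDirectory() as root:
    processed_dir = Path(root) / "processed"
    processed_dir.mkdir()
    expected = ["data.pt"]  # hypothetical processed file name

    # Nothing has been written yet, so processing would have to run.
    before = all_present(processed_dir, expected)

    # Once the file exists, processing could be skipped.
    (processed_dir / expected[0]).touch()
    after = all_present(processed_dir, expected)
```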
- class MoleculeDataset(root, name, transform=None, pre_transform=None, pre_filter=None)
The extension of MoleculeNet with MUTAG.
The MoleculeNet benchmark collection from the MoleculeNet: A Benchmark for Molecular Machine Learning paper, containing datasets from physical chemistry, biophysics and physiology.
The MoleculeNet datasets come with the additional node and edge features introduced by the Open Graph Benchmark, and the node features in the MUTAG dataset are one-hot features denoting the atom types.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset ("MUTAG", "ESOL", "FreeSolv", "Lipo", "PCBA", "MUV", "HIV", "BACE", "BBBP", "Tox21", "ToxCast", "SIDER", "ClinTox").
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
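The difference between pre_filter, pre_transform, and transform is when each hook runs. The sketch below uses a hypothetical stand-in class, `TinyDataset`, with plain dictionaries instead of torch_geometric.data.Data objects; applying pre_filter before pre_transform is an assumption of this sketch, not something the parameter descriptions specify.

```python
class TinyDataset:
    """Hypothetical stand-in that mimics the three dataset hooks."""

    def __init__(self, raw_items, transform=None, pre_transform=None, pre_filter=None):
        self.transform = transform
        stored = []
        for item in raw_items:
            # pre_filter decides once whether the item is kept at all.
            if pre_filter is not None and not pre_filter(item):
                continue
            # pre_transform runs once, before the item is "saved".
            if pre_transform is not None:
                item = pre_transform(item)
            stored.append(item)
        self._stored = stored  # plays the role of the processed file on disk

    def __len__(self):
        return len(self._stored)

    def __getitem__(self, idx):
        item = self._stored[idx]
        # transform runs on every access and never changes what is stored.
        return self.transform(item) if self.transform is not None else item


ds = TinyDataset(
    raw_items=[{"num_nodes": 3}, {"num_nodes": 12}, {"num_nodes": 7}],
    pre_filter=lambda d: d["num_nodes"] >= 5,
    pre_transform=lambda d: {**d, "cached": True},
    transform=lambda d: {**d, "accessed": True},
)
```

Because pre_transform results are stored, expensive deterministic preprocessing fits there, while random augmentation belongs in transform so that it is re-applied on each access.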
- property processed_file_names
The names of the files in the self.processed_dir folder that must be present in order to skip processing.
- property raw_file_names
The names of the files in the self.raw_dir folder that must be present in order to skip downloading.
- class SentiGraphDataset(root, name, transform=None, pre_transform=<function undirected_graph>)
The SentiGraph datasets from Explainability in Graph Neural Networks: A Taxonomic Survey. The datasets use a pretrained BERT model as the node feature extractor and dependency trees as edges to convert text sentiment datasets into graph classification datasets.
The dataset Graph-SST2 should be downloaded to the proper directory before running. All three datasets, Graph-SST2, Graph-SST5, and Graph-Twitter, can be downloaded from this link.
- Parameters
root (str) – Root directory where the datasets are saved.
name (str) – The name of the dataset.
transform (Callable, None) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (Callable, None) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
Note
The default value of pre_transform is undirected_graph(), which converts the directed graph in the original data into an undirected graph before it is saved to disk.
- property processed_file_names
The names of the files in the self.processed_dir folder that must be present in order to skip processing.
- property raw_file_names
The names of the files in the self.raw_dir folder that must be present in order to skip downloading.
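The default pre_transform of SentiGraphDataset converts directed dependency edges into undirected ones. The idea can be sketched on a plain edge list as follows; `to_undirected_edges` is a hypothetical helper, not the library's undirected_graph() function, and dropping duplicate edges is a choice made for this sketch.

```python
def to_undirected_edges(edges):
    """Add the reverse of every directed edge, skipping duplicates."""
    seen = set()
    result = []
    for src, dst in edges:
        for edge in ((src, dst), (dst, src)):
            if edge not in seen:
                seen.add(edge)
                result.append(edge)
    return result


# A tiny dependency tree: token 1 is the head of tokens 0 and 2, token 2 of token 3.
dependency_edges = [(1, 0), (1, 2), (2, 3)]
undirected = to_undirected_edges(dependency_edges)
```

After the conversion every edge appears in both directions, so message passing can flow from dependents to heads as well as from heads to dependents.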
- class SynGraphDataset(root, name, transform=None, pre_transform=None)
The Synthetic datasets used in Parameterized Explainer for Graph Neural Network. Each dataset takes a Barabási–Albert (BA) graph or a balanced tree as the base graph and randomly attaches specific motifs to it.
- Parameters
root (str) – Root data directory in which to save the datasets.
name (str) – The name of the dataset, including BA_shapes and BA_grid.
transform (Callable, None) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (Callable, None) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
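The construction described for SynGraphDataset, a base graph with motifs attached at random positions, can be sketched on plain edge lists. Everything here is illustrative: `attach_motif` is a hypothetical helper, the five-node "house" motif and the four-node cycle used as the base graph are assumptions, and this is not the code that generated the released datasets.

```python
import random

# Assumed motif: a 5-node "house" (a 4-cycle plus a roof node).
HOUSE_EDGES = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 4), (1, 4)]
HOUSE_SIZE = 5


def attach_motif(base_edges, num_base_nodes, motif_edges, motif_size, rng):
    """Append one motif to the graph and link it to a random existing node."""
    offset = num_base_nodes  # motif nodes are numbered after the existing nodes
    edges = list(base_edges)
    edges += [(u + offset, v + offset) for u, v in motif_edges]
    # Connect a randomly chosen existing node to the first motif node.
    anchor = rng.randrange(num_base_nodes)
    edges.append((anchor, offset))
    return edges, num_base_nodes + motif_size


rng = random.Random(0)
base_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]  # stand-in base graph with 4 nodes
edges, num_nodes = attach_motif(base_edges, 4, HOUSE_EDGES, HOUSE_SIZE, rng)
```

Calling `attach_motif` repeatedly with the returned edge list and node count attaches further motifs; the motif nodes then serve as the ground-truth explanation for node-level predictions.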
- property processed_file_names
The names of the files in the self.processed_dir folder that must be present in order to skip processing.
- property raw_file_names
The names of the files in the self.raw_dir folder that must be present in order to skip downloading.