
Utilities under dig.ggraph.utils.


Reward consisting of logP penalized by the synthetic accessibility (SA) score and the number of long cycles, as described in (Kusner et al. 2017).


Checks the chemical validity of the mol object.


Checks that no atoms in the mol have exceeded their possible valency.


Converts radical electrons in a molecule into bonds to hydrogens.


Construct molecules from the node tensors and adjacency tensors generated by one-shot molecular graph generation methods.


Reward for similarity to a target molecule, based on the Tanimoto similarity between the ECFP fingerprints of the input molecule and the target molecule.


Flags molecules based on a steric energy cutoff after max_num_iters iterations of MMFF94 forcefield minimization.


Flags molecules that contain problematic functional groups, using the set of ZINC rules provided at http://blaster.docking.org/filtering/rules_default.txt.


Reward consisting of logP penalized by the synthetic accessibility (SA) score and the number of long cycles, as described in (Kusner et al. 2017). Scores are normalized using the statistics of the 250k_rndm_zinc_drugs_clean.smi dataset.


mol – RDKit mol object

Return type


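As a rough illustration of how such a reward can be assembled, the sketch below combines the three terms in plain Python. It assumes a ring counts as "long" when it has more than six atoms and that the penalty is the excess size of the largest ring; the source does not spell out either rule. The names `long_cycle_penalty`, `penalized_logp_sketch`, and `stats` are hypothetical, and the logP, SA, and ring-size inputs are supplied by hand rather than computed from a mol object.

```python
def long_cycle_penalty(ring_sizes, max_ring_size=6):
    """Atoms by which the largest ring exceeds max_ring_size (0 if none does)."""
    # Assumption: only the largest ring is penalized.
    largest = max(ring_sizes, default=0)
    return max(largest - max_ring_size, 0)


def penalized_logp_sketch(log_p, sa_score, ring_sizes, stats=None):
    """Combine logP, SA score, and the long-cycle penalty into one score."""
    # stats maps each term to a hypothetical (mean, std) pair; the defaults
    # below leave every term unnormalized.
    stats = stats or {"log_p": (0.0, 1.0), "sa": (0.0, 1.0), "cycle": (0.0, 1.0)}

    def normalize(value, key):
        mean, std = stats[key]
        return (value - mean) / std

    return (normalize(log_p, "log_p")
            - normalize(sa_score, "sa")
            - normalize(long_cycle_penalty(ring_sizes), "cycle"))


# One six-membered and one eight-membered ring: the cycle penalty is 2.
score = penalized_logp_sketch(2.5, 3.0, ring_sizes=[6, 8])
print(score)  # -2.5
```

Passing dataset statistics through `stats` would give a normalized score; the values to use are not listed in this section.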

Checks the chemical validity of the mol object. Existing mol object is not modified. Radicals pass this test.


mol – RDKit mol object

Return type

bool, True if chemically valid, False otherwise


Checks that no atoms in the mol have exceeded their possible valency.


mol – RDKit mol object

Return type

bool, True if no valency issues, False otherwise
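
The idea behind a valency check can be sketched without RDKit: sum the bond orders at each atom and compare the total with the maximum that element allows. The table `MAX_VALENCE` and the function `has_valid_valences` below are hypothetical and heavily simplified (neutral atoms only, integer bond orders, a handful of elements); they are not the library's implementation.

```python
# Simplified maximum valences for neutral atoms (an assumption for this sketch).
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2, "F": 1}


def has_valid_valences(atoms, bonds):
    """Return True if no atom uses more bond order than its element allows.

    atoms: list of element symbols, indexed by atom position.
    bonds: list of (i, j, order) tuples with integer bond orders.
    """
    used = [0] * len(atoms)
    for i, j, order in bonds:
        used[i] += order
        used[j] += order
    return all(u <= MAX_VALENCE[symbol] for symbol, u in zip(atoms, used))


# Formaldehyde skeleton C=O: carbon uses 2 of 4, oxygen uses 2 of 2.
print(has_valid_valences(["C", "O"], [(0, 1, 2)]))  # True

# Oxygen bonded to three carbons exceeds its neutral valence of 2.
print(has_valid_valences(["O", "C", "C", "C"], [(0, 1, 1), (0, 2, 1), (0, 3, 1)]))  # False
```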


Converts radical electrons in a molecule into bonds to hydrogens. Only use this if the molecule is valid. Returns a new mol object.


mol – RDKit mol object

Return type

RDKit mol object

gen_mol_from_one_shot_tensor(adj, x, atomic_num_list, correct_validity=True, largest_connected_comp=True)

Construct molecules from the node tensors and adjacency tensors generated by one-shot molecular graph generation methods.

  • adj (Tensor) – The adjacency tensor with shape [number of samples, number of possible bond types, maximum number of atoms, maximum number of atoms].

  • x (Tensor) – The node tensor with shape [number of samples, number of possible atom types, maximum number of atoms].

  • atomic_num_list (list) – A list specifying which atom each channel of the 2nd dimension of x corresponds to.

  • correct_validity (bool, optional) – Whether to use the validity correction introduced by the paper MoFlow: an invertible flow model for generating molecular graphs. (default: True)

  • largest_connected_comp (bool, optional) – Whether to use the largest connected component as the final molecule in the validity correction. (default: True)

Return type

A list of RDKit mol objects. The length of the list is the number of samples.


>>> adj = torch.rand(2, 4, 38, 38)
>>> x = torch.rand(2, 10, 38)
>>> atomic_num_list = [6, 7, 8, 9, 15, 16, 17, 35, 53, 0]
>>> gen_mols = gen_mol_from_one_shot_tensor(adj, x, atomic_num_list)
>>> gen_mols
[<rdkit.Chem.rdchem.Mol>, <rdkit.Chem.rdchem.Mol>]
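
To make the tensor layout concrete, the following sketch decodes a single sample held in plain nested lists. It takes the argmax over the atom-type channels for each atom position and over the bond-type channels for each atom pair. Two assumptions are not stated in this section: that an atomic number of 0 marks an empty (padding) position, and that the last bond channel means "no bond". The function name `decode_one_shot_sample` is hypothetical, and no validity correction is performed.

```python
def decode_one_shot_sample(adj, x, atomic_num_list):
    """Decode one sample into (atoms, bonds).

    x:   [num_atom_types][max_atoms] scores
    adj: [num_bond_types][max_atoms][max_atoms] scores
    """
    max_atoms = len(x[0])
    no_bond_channel = len(adj) - 1  # assumption: last channel means "no bond"

    # Pick the highest-scoring atom type at each position.
    atomic_nums = []
    for i in range(max_atoms):
        channel = max(range(len(x)), key=lambda c: x[c][i])
        atomic_nums.append(atomic_num_list[channel])

    # Assumption: atomic number 0 marks a padding position.
    keep = [i for i, z in enumerate(atomic_nums) if z != 0]
    atoms = [(i, atomic_nums[i]) for i in keep]

    # Pick the highest-scoring bond type for each pair of kept atoms.
    bonds = []
    for a, i in enumerate(keep):
        for j in keep[a + 1:]:
            channel = max(range(len(adj)), key=lambda c: adj[c][i][j])
            if channel != no_bond_channel:
                bonds.append((i, j, channel))
    return atoms, bonds


atomic_num_list = [6, 8, 0]
x = [[0.9, 0.1, 0.2],    # carbon channel
     [0.05, 0.8, 0.1],   # oxygen channel
     [0.05, 0.1, 0.7]]   # padding channel
adj = [[[0.0] * 3 for _ in range(3)] for _ in range(4)]
for c in range(3):
    adj[3][c] = [0.5, 0.5, 0.5]      # default every pair to "no bond"
adj[1][0][1] = adj[1][1][0] = 0.9    # bond channel 1 between atoms 0 and 1

print(decode_one_shot_sample(adj, x, atomic_num_list))
# ([(0, 6), (1, 8)], [(0, 1, 1)])
```

Which chemical bond order each channel index stands for is left open here, since the section does not define the channel ordering.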

reward_target_molecule_similarity(mol, target, radius=2, nBits=2048, useChirality=True)

Reward for similarity to a target molecule, based on the Tanimoto similarity between the ECFP fingerprints of mol and the target molecule.

  • mol – RDKit mol object

  • target – RDKit mol object

Return type

float in the range [0.0, 1.0]
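
For reference, the Tanimoto similarity of two bit fingerprints is the number of bits set in both divided by the number of bits set in either. The sketch below computes it on plain Python sets of on-bit indices; it does not generate ECFP fingerprints, and the function name `tanimoto_similarity` is hypothetical. Returning 0.0 when both fingerprints are empty is a choice made for this sketch only.

```python
def tanimoto_similarity(on_bits_a, on_bits_b):
    """Tanimoto similarity between two fingerprints given as on-bit indices."""
    a, b = set(on_bits_a), set(on_bits_b)
    union = a | b
    if not union:
        return 0.0  # sketch-only convention for two empty fingerprints
    return len(a & b) / len(union)


# Two shared bits (2 and 3) out of four distinct bits overall.
print(tanimoto_similarity({1, 2, 3}, {2, 3, 4}))  # 0.5
```

Identical fingerprints score 1.0 and fingerprints with no shared bits score 0.0, which matches the [0.0, 1.0] range stated above.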

steric_strain_filter(mol, cutoff=0.82, max_attempts_embed=20, max_num_iters=200)

Flags molecules based on a steric energy cutoff after max_num_iters iterations of MMFF94 forcefield minimization. The cutoff is based on the average angle bend strain energy of the molecule.

  • mol – RDKit mol object

  • cutoff (float, optional) – Threshold in kcal/mol per angle. If the minimized energy is above this threshold, the molecule fails the steric strain filter. (default: 0.82)

  • max_attempts_embed (int, optional) – Number of attempts to generate initial 3d coordinates. (default: 20)

  • max_num_iters (int, optional) – Number of iterations of forcefield minimization. (default: 200)

Return type

bool, True if molecule could be successfully minimized, and resulting energy is below cutoff, otherwise False.
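
The cutoff comparison itself is simple arithmetic and can be sketched on its own: average the angle bend energies and compare the mean with the cutoff. The sketch below assumes the per-angle energies (in kcal/mol) are already available as a list; embedding and MMFF94 minimization are not shown. The function name `passes_angle_strain_cutoff` is hypothetical, and treating a molecule with no angle terms as passing is an assumption of this sketch.

```python
def passes_angle_strain_cutoff(angle_energies, cutoff=0.82):
    """Return True if the mean angle bend energy (kcal/mol per angle) is below cutoff."""
    if not angle_energies:
        return True  # assumption: nothing to strain, so the molecule passes
    average = sum(angle_energies) / len(angle_energies)
    return average < cutoff


print(passes_angle_strain_cutoff([0.5, 0.9, 0.7]))  # True
print(passes_angle_strain_cutoff([1.0, 0.9]))       # False
```

Because the cutoff applies to the per-angle average, a single strained angle can be offset by many relaxed ones; a stricter filter would need a different statistic.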


Flags molecules that contain problematic functional groups, using the set of ZINC rules provided at http://blaster.docking.org/filtering/rules_default.txt.


mol – RDKit mol object

Return type

bool, True if the molecule is okay (i.e., does not match any of the rules), False otherwise.