dig.ggraph.evaluation

Evaluation interfaces under dig.ggraph.evaluation.

ConstPropOptEvaluator

Evaluator for constrained optimization task.

PropOptEvaluator

Evaluator for property optimization task.

RandGenEvaluator

Evaluator for random generation task.

class ConstPropOptEvaluator

Evaluator for the constrained optimization task. The metrics are the average property improvement, the average similarity, and the success rate under each of the similarity thresholds 0.0, 0.2, 0.4, and 0.6.

static eval(input_dict)

Run evaluation for the constrained optimization task. Computes the average property improvement, the average similarity, and the success rate under each of the similarity thresholds 0.0, 0.2, 0.4, and 0.6.

Parameters

input_dict (dict) – A Python dict with the following items: “mols_0”, “mols_2”, “mols_4”, “mols_6”: the lists of optimized molecules under the similarity thresholds 0.0, 0.2, 0.4, and 0.6, respectively, all represented by RDKit Chem.RWMol or Chem.Mol objects; “inp_smiles”: the list of SMILES strings of the input molecules to be optimized.

Return type

dict – A Python dict with the following items: 0, 2, 4, 6: the metric values under the similarity thresholds 0.0, 0.2, 0.4, and 0.6, respectively. Each metric value is a tuple (success rate, mean of similarity, standard deviation of similarity, mean of property improvement, standard deviation of property improvement).
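To make the shape of the returned tuple concrete, the following is a minimal pure-Python sketch of how such a five-value summary could be assembled from per-molecule results. It is not the library's implementation: the helper `summarize_threshold`, the sample data, the success criterion (similarity at or above the threshold with a positive improvement), and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def summarize_threshold(records, threshold):
    """Summarize (similarity, improvement) pairs for one similarity threshold.

    Assumption: a pair counts as a success when similarity >= threshold and
    improvement > 0. The library's own criterion may differ.
    """
    successes = [(s, imp) for s, imp in records if s >= threshold and imp > 0]
    if not successes:
        return (0.0, 0.0, 0.0, 0.0, 0.0)
    sims = [s for s, _ in successes]
    imps = [imp for _, imp in successes]
    return (
        len(successes) / len(records),  # success rate
        mean(sims),                     # mean of similarity
        pstdev(sims),                   # standard deviation of similarity
        mean(imps),                     # mean of property improvement
        pstdev(imps),                   # standard deviation of improvement
    )


# Hypothetical (similarity, property improvement) pairs for four inputs.
# For brevity the same pairs are reused for every threshold; in the real
# task each threshold has its own list of optimized molecules.
records = [(0.5, 2.0), (0.7, 4.0), (0.3, 1.0), (0.9, -1.0)]

# Keys 0, 2, 4, 6 map to thresholds 0.0, 0.2, 0.4, 0.6.
results = {key: summarize_threshold(records, key / 10) for key in (0, 2, 4, 6)}
```

Reading a result then amounts to unpacking one entry, for example `rate, sim_mean, sim_std, imp_mean, imp_std = results[4]`.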

class PropOptEvaluator(prop_name='plogp')

Evaluator for the property optimization task. The metric is the top-3 property scores among the generated molecules.

Parameters

prop_name (str) – The name of the molecular property to evaluate. Use ‘plogp’ for penalized logP or ‘qed’ for Quantitative Estimate of Druglikeness (QED). (default: ‘plogp’)

eval(input_dict)

Run evaluation for the property optimization task. Finds the three molecules with the highest property scores.

Parameters

input_dict (dict) – A Python dict with the following item: “mols”: a list of generated molecules represented by RDKit Chem.Mol or Chem.RWMol objects.

Return type

dict – A Python dict with the following items: 1: information on the molecule with the highest property score; 2: information on the molecule with the second highest property score; 3: information on the molecule with the third highest property score. Each molecule's information is a tuple (SMILES string, property score).
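The return format described above can be illustrated with a short pure-Python sketch that ranks already-scored molecules and keys the best three by rank. The helper `top3` and the scores are hypothetical; the sketch does not compute penalized logP or QED and is not the library's own code.

```python
def top3(scored):
    """Return {1: (smiles, score), 2: ..., 3: ...} for the three best scores."""
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)[:3]
    return {rank: item for rank, item in enumerate(ranked, start=1)}


# Hypothetical (SMILES, score) pairs; the scores are made up for illustration.
scored = [("CCO", 0.41), ("c1ccccc1", 0.44), ("CC(=O)O", 0.43), ("C", 0.36)]
result = top3(scored)

best_smiles, best_score = result[1]
```

Note that the keys are the integers 1, 2, and 3 rather than strings, matching the documented return type.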

class RandGenEvaluator

Evaluator for the random generation task. The metrics are the validity ratio, uniqueness ratio, and novelty ratio, all expressed as percentages.

static eval(input_dict)

Run evaluation for the random generation task. Computes the validity ratio, uniqueness ratio, and novelty ratio of the generated molecules, all expressed as percentages.

Parameters

input_dict (dict) – A Python dict with the following items: “mols”: the list of generated molecules represented by RDKit Chem.RWMol or Chem.Mol objects; “train_smiles”: the list of SMILES strings used for training.

Return type

dict – A Python dict with the following items: “valid_ratio”: validity percentage; “unique_ratio”: uniqueness percentage; “novel_ratio”: novelty percentage.
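As a rough illustration of the three percentages, the sketch below computes them from a list of SMILES strings in which `None` stands for a molecule that failed validation. The helper `generation_ratios` is hypothetical, and the denominators are assumptions (validity over all generated molecules, uniqueness over valid ones, novelty over unique ones); the library may define them differently.

```python
def generation_ratios(generated, train_smiles):
    """Compute validity, uniqueness, and novelty percentages.

    generated: list of canonical SMILES strings, with None for an invalid
    molecule. The choice of denominators here is an assumption.
    """
    valid = [s for s in generated if s is not None]
    unique = set(valid)
    novel = unique - set(train_smiles)

    def pct(part, whole):
        # Guard against empty inputs instead of dividing by zero.
        return 100.0 * part / whole if whole else 0.0

    return {
        "valid_ratio": pct(len(valid), len(generated)),
        "unique_ratio": pct(len(unique), len(valid)),
        "novel_ratio": pct(len(novel), len(unique)),
    }


# Hypothetical data: five generated molecules, one invalid, one duplicate.
generated = ["CCO", "CCO", "CCN", None, "CCC"]
train_smiles = ["CCN"]
ratios = generation_ratios(generated, train_smiles)
```

With this sample, four of five molecules are valid, three of the four valid ones are distinct, and two of those three do not appear in the training set.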