openff.bespokefit.workflows.BespokeWorkflowFactory

model openff.bespokefit.workflows.BespokeWorkflowFactory

Bases: ClassBase

The BespokeFit workflow factory is a template of the settings that will be used to generate the specific fitting schema for each molecule.

Fields
field initial_force_field: str = 'openff_unconstrained-2.0.0.offxml'

The name of the unconstrained force field to use as a starting point for optimization. The force field must be installed with conda/mamba.

field optimizer: Union[str, ForceBalanceSchema] = ForceBalanceSchema(type='ForceBalance', max_iterations=10, job_type='optimize', penalty_type='L1', step_convergence_threshold=0.01, objective_convergence_threshold=0.01, gradient_convergence_threshold=0.01, n_criteria=2, eigenvalue_lower_bound=0.01, finite_difference_h=0.01, penalty_additive=1.0, initial_trust_radius=-0.25, minimum_trust_radius=0.05, error_tolerance=1.0, adaptive_factor=0.2, adaptive_damping=1.0, normalize_weights=False, extras={})

The optimizer that should be used with the targets already set.

field target_templates: List[Union[TorsionProfileTargetSchema, AbInitioTargetSchema, VibrationTargetSchema, OptGeoTargetSchema]] = [TorsionProfileTargetSchema(weight=1.0, reference_data=None, calculation_specification=None, extras={}, type='TorsionProfile', attenuate_weights=True, energy_denominator=1.0, energy_cutoff=10.0)]

Templates for the fitting targets to use as part of the optimization. The reference_data attribute of each schema will be automatically populated by this factory.

field parameter_hyperparameters: List[Union[ProperTorsionHyperparameters, BondHyperparameters, AngleHyperparameters, VdWHyperparameters, ImproperTorsionHyperparameters]] = [ProperTorsionHyperparameters(type='ProperTorsions', priors={'k': 6.0})]

The settings that describe how each type of parameter, e.g. the force constant of a bond parameter, should be restrained during the optimization, for example through the inclusion of harmonic priors.

field target_torsion_smirks: Optional[List[str]] = ['[!#1]~[!$(*#*)&!D1:1]-,=;!@[!$(*#*)&!D1:2]~[!#1]']

A list of SMARTS patterns that should be used to identify the bonds within the target molecule to generate bespoke torsions around. Each SMARTS pattern should include two indexed atoms that correspond to the two atoms involved in the central bond. By default, bespoke torsion parameters (if requested) will be constructed for all non-terminal ‘rotatable bonds’.
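As a quick illustration of the "two indexed atoms" requirement, the sketch below collects the atom map indices in a pattern with a regular expression and checks that exactly indices 1 and 2 are tagged. The helper names `central_bond_indices` and `tags_central_bond` are hypothetical and not part of BespokeFit, and the regex is only a rough check; a real SMARTS parser from a cheminformatics toolkit would be more robust.

```python
import re
from typing import List

# Default value of target_torsion_smirks, copied from the field definition.
DEFAULT_TORSION_SMIRKS = "[!#1]~[!$(*#*)&!D1:1]-,=;!@[!$(*#*)&!D1:2]~[!#1]"


def central_bond_indices(smirks: str) -> List[int]:
    """Return the sorted atom map indices found in a SMARTS string.

    Hypothetical helper: it only looks for ':<n>]' at the end of a
    bracketed atom and does not validate the pattern itself.
    """
    return sorted(int(n) for n in re.findall(r":(\d+)\]", smirks))


def tags_central_bond(smirks: str) -> bool:
    """True if the pattern tags exactly two atoms, with indices 1 and 2."""
    return central_bond_indices(smirks) == [1, 2]


# The default pattern tags both atoms of the central bond.
default_ok = tags_central_bond(DEFAULT_TORSION_SMIRKS)
# A pattern with no map indices would not identify a central bond.
untagged_ok = tags_central_bond("[#6]-[#6]")
```

A check like this could be run on user-supplied patterns before assigning them to the field, so that a missing index is caught early rather than during schema generation.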

field smirk_settings: SMIRKSettings = SMIRKSettings(expand_torsion_terms=True, generate_bespoke_terms=True)

The settings that should be used when generating SMIRKS patterns for this optimization stage.

field fragmentation_engine: Optional[Union[openff.fragmenter.fragment.WBOFragmenter, openff.fragmenter.fragment.PfizerFragmenter]] = WBOFragmenter(functional_groups={'hydrazine': '[NX3:1][NX3:2]', 'hydrazone': '[NX3:1][NX2:2]', 'nitric_oxide': '[N:1]-[O:2]', 'amide': '[#7:1][#6:2](=[#8:3])', 'amide_n': '[#7:1][#6:2](-[O-:3])', 'amide_2': '[NX3:1][CX3:2](=[OX1:3])[NX3:4]', 'aldehyde': '[CX3H1:1](=[O:2])[#6:3]', 'sulfoxide_1': '[#16X3:1]=[OX1:2]', 'sulfoxide_2': '[#16X3+:1][OX1-:2]', 'sulfonyl': '[#16X4:1](=[OX1:2])=[OX1:3]', 'sulfinic_acid': '[#16X3:1](=[OX1:2])[OX2H,OX1H0-:3]', 'sulfinamide': '[#16X4:1](=[OX1:2])(=[OX1:3])([NX3R0:4])', 'sulfonic_acid': '[#16X4:1](=[OX1:2])(=[OX1:3])[OX2H,OX1H0-:4]', 'phosphine_oxide': '[PX4:1](=[OX1:2])([#6:3])([#6:4])([#6:5])', 'phosphonate': '[P:1](=[OX1:2])([OX2H,OX1-:3])([OX2H,OX1-:4])', 'phosphate': '[PX4:1](=[OX1:2])([#8:3])([#8:4])([#8:5])', 'carboxylic_acid': '[CX3:1](=[O:2])[OX1H0-,OX2H1:3]', 'nitro_1': '[NX3+:1](=[O:2])[O-:3]', 'nitro_2': '[NX3:1](=[O:2])=[O:3]', 'ester': '[CX3:1](=[O:2])[OX2H0:3]', 'tri_halide': '[#6:1]([F,Cl,I,Br:2])([F,Cl,I,Br:3])([F,Cl,I,Br:4])'}, scheme='WBO', wbo_options=WBOOptions(method='am1-wiberg-elf10', max_conformers=800, rms_threshold=1.0), threshold=0.03, heuristic='path_length', keep_non_rotor_ring_substituents=False)

The fragmentation engine that should be used to fragment the molecule. If None is provided, the molecules will not be fragmented. By default we use the WBO fragmenter by the Open Force Field Consortium.

field default_qc_specs: List[QCSpec] [Optional]

The default specification (e.g. method, basis) to use when performing any new QC calculations. If multiple specs are provided, each spec will be considered in order until one is found that i) is available based on the installed dependencies, and ii) is compatible with the molecule of interest.
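The "first spec that passes both checks" rule described above can be sketched in plain Python. This is not BespokeFit's implementation: the `Spec` dataclass is a hypothetical stand-in for `QCSpec`, the two predicate callables are assumptions, and the method, basis, and program values are illustrative only.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Spec:
    """Hypothetical stand-in for a QC specification (not the real QCSpec)."""

    method: str
    basis: Optional[str]
    program: str


def select_spec(
    specs: Sequence[Spec],
    is_available: Callable[[Spec], bool],
    is_compatible: Callable[[Spec], bool],
) -> Optional[Spec]:
    """Return the first spec that passes both checks, or None if none do."""
    for spec in specs:
        if is_available(spec) and is_compatible(spec):
            return spec
    return None


# Specs are listed in order of preference (illustrative values).
specs = [
    Spec("b3lyp-d3bj", "dzvp", "psi4"),
    Spec("gfn2xtb", None, "xtb"),
]
installed = {"xtb"}  # pretend only this program is installed

chosen = select_spec(
    specs,
    is_available=lambda s: s.program in installed,
    is_compatible=lambda s: True,  # assume every spec suits the molecule
)
```

Here `chosen` is the second spec, because the first one fails the availability check. Ordering the list from most to least preferred therefore acts as a fallback chain.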

property target_smirks: List[SMIRKSType]

Returns a list of the target SMIRKS types based on the selected hyperparameters.

to_file(file_name: str) -> None

Export the factory to a YAML or JSON file.

Parameters

file_name (str) – The name of the file the workflow should be exported to. The file type is determined from the name.
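To show what "determined from the name" can mean in practice, here is a minimal sketch that picks a serialization format from the file suffix. The function `export_settings` is hypothetical and is not how BespokeFit implements `to_file`; the accepted suffixes are an assumption, and the YAML branch relies on the third-party PyYAML package, which is imported only if that branch is taken.

```python
import json
import tempfile
from pathlib import Path


def export_settings(settings: dict, file_name: str) -> str:
    """Write settings to file_name, choosing the format from its suffix.

    Hypothetical helper. Returns the name of the format that was used.
    """
    suffix = Path(file_name).suffix.lower()
    if suffix == ".json":
        Path(file_name).write_text(json.dumps(settings, indent=2))
        return "json"
    if suffix in (".yaml", ".yml"):
        import yaml  # PyYAML, third-party; only needed for YAML output

        Path(file_name).write_text(yaml.safe_dump(settings))
        return "yaml"
    raise ValueError(f"Unsupported file type: {suffix!r}")


# Round-trip a small settings dictionary through a JSON file.
with tempfile.TemporaryDirectory() as tmp:
    path = str(Path(tmp) / "factory.json")
    kind = export_settings(
        {"initial_force_field": "openff_unconstrained-2.0.0.offxml"}, path
    )
    restored = json.loads(Path(path).read_text())
```

Raising an error for an unrecognised suffix, rather than silently defaulting to one format, keeps a mistyped file name from producing a file in an unexpected format.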

classmethod from_file(file_name: str)

Build the factory from a model serialised to file.

optimization_schemas_from_molecules(molecules: Union[Molecule, List[Molecule]], processors: Optional[int] = 1) -> List[BespokeOptimizationSchema]

This is the main function of the workflow: it takes the general fitting meta-template and generates a specific one for the set of molecules that are passed.

Parameters
  • molecules – The molecule or list of molecules that should be processed by the factory to generate the fitting schemas.

  • processors – The number of processors that should be used when building the workflow. This helps with fragmentation, which can be quite slow for large numbers of molecules.

optimization_schema_from_molecule(molecule: Molecule, index: int = 0) -> Optional[BespokeOptimizationSchema]

Build an optimization schema from an input molecule; this involves fragmentation.

optimization_schemas_from_results(results: Union[TorsionDriveResultCollection, OptimizationResultCollection, BasicResultCollection], combine: bool = False, processors: Optional[int] = 1) -> List[BespokeOptimizationSchema]

Create a set of optimization schemas (one per molecule) from some results.

Here input molecules are turned into tasks and results are updated during the process.

If multiple targets are in the workflow, the results will be applied to the correct target; other targets can be updated afterwards by calling update with parameters.