Modifying a SMIRNOFF force field

In this example, we’ll parameterize a ligand automatically, and then play with its parameters to demonstrate the Toolkit’s ability to facilitate force field optimization. For each modification, we’ll calculate the energy for the original conformation, and then minimize the energy and visualize the result.

from copy import deepcopy

from openff.units import unit
from openff.units.openmm import from_openmm, to_openmm
from openmm import LangevinIntegrator, app
from openmm import unit as openmm_unit

import openff.toolkit.typing.engines.smirnoff.parameters as offtk_parameters
from openff.toolkit import ForceField, Molecule

We’re going to do a lot of changing a parameter and then visualising what happened, so let’s define a convenience function to do just that

def minimize_and_visualize(molecule, forcefield):
    # Sort out our input data
    mol_topology = molecule.to_topology()
    mol_system = forcefield.create_openmm_system(
        mol_topology,
        charge_from_molecules=[molecule],
    )

    # Set up the minimization and point calculation
    integrator = LangevinIntegrator(
        300 * openmm_unit.kelvin,
        1 / openmm_unit.picosecond,
        0.002 * openmm_unit.picoseconds,
    )
    simulation = app.Simulation(mol_topology.to_openmm(), mol_system, integrator)
    simulation.context.setPositions(to_openmm(molecule.conformers[0]))

    # Get the initial energy
    initial_potential = simulation.context.getState(getEnergy=True).getPotentialEnergy()

    # Energy minimize
    simulation.minimizeEnergy()
    minimized_state = simulation.context.getState(getPositions=True, getEnergy=True)
    minimized_potential = minimized_state.getPotentialEnergy()
    minimized_coords = from_openmm(minimized_state.getPositions(asNumpy=True))

    # Visualize
    vis_mol = deepcopy(molecule)
    vis_mol.conformers[0] = minimized_coords
    view = vis_mol.visualize(backend="nglview")
    print(
        f"Initial energy is {initial_potential.format('%0.1F')};",
        f"Minimized energy is {minimized_potential.format('%0.1F')}",
    )
    return view

Getting to know you — the molecule

This “ligand” is a modified version of the molecule we introduced in the Toolkit Showcase. It’s just been altered for a slightly more exciting example here. This also lets us demonstrate constructing a Molecule from a SMILES string!

ligand_smiles = "CC(C)(C)c1c(O)c(O)c2c(c1O)[C@H]1OCCC[C@H]1[C@H](c1cc(O)c(O)c(F)c1)N2"
ligand = Molecule.from_smiles(ligand_smiles)
ligand.generate_conformers(n_conformers=1)
force_field = ForceField("openff-2.1.0.offxml")

Computing charges is expensive, and we’re going to be changing the force field a lot, so we can save time by computing them just once and caching them.

ℹ️ Note that when we call create_openmm_system above we pass in the charges with the charge_from_molecules argument!
ligand.partial_charges = force_field.get_partial_charges(ligand)

The ligand, visualised

Let’s take a close look at the ligand and decide what we want to modify. We’ll label the atoms with their indexes so we can identify them later

view = minimize_and_visualize(ligand, force_field)
view.add_label(label_type="atomindex", color="black", attachment="middle-center")
view
Initial energy is 36.7 kJ/mol; Minimized energy is -311.5 kJ/mol

Getting to know all about you — Investigating assigned parameters

Let’s start with something simple — lengthening the bond to the fluorine. We can use ForceField.label_molecules to identify the appropriate parameters:

ff_applied_parameters = force_field.label_molecules(ligand.to_topology())
ff_applied_parameters
[{'Constraints': <openff.toolkit.topology.topology.ValenceDict at 0x7f7789bd5cc0>,
  'Bonds': <openff.toolkit.topology.topology.ValenceDict at 0x7f7789ab3d90>,
  'Angles': <openff.toolkit.topology.topology.ValenceDict at 0x7f7789aa4310>,
  'ProperTorsions': <openff.toolkit.topology.topology.ValenceDict at 0x7f7789c40dc0>,
  'ImproperTorsions': <openff.toolkit.topology.topology.ImproperDict at 0x7f7789b86da0>,
  'vdW': <openff.toolkit.topology.topology.ValenceDict at 0x7f7789a8cac0>,
  'Electrostatics': <openff.toolkit.topology.topology.ValenceDict at 0x7f7789a8d960>,
  'LibraryCharges': {},
  'ToolkitAM1BCC': <openff.toolkit.topology.topology.ValenceDict at 0x7f7789a8c550>}]

label_molecules() returns a list of the molecules that can be parameterized from the arguments. It returns them as dictionaries that give us access to the ForceField parameters that are used for the molecule. Since we’re only passing in a single molecule, ff_applied_parameters is a list of one element. We can see the bonds that are used for the ligand by converting the appropriate ValenceDict to a regular dictionary:

dict(ff_applied_parameters[0]["Bonds"])
{(0,
  1): <BondType with smirks: [#6X4:1]-[#6X4:2]  id: b1  length: 1.527940216866 angstrom  k: 419.9869268191 kilocalorie / angstrom ** 2 / mole  >,
 (0,
  30): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (0,
  31): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (0,
  32): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (1,
  2): <BondType with smirks: [#6X4:1]-[#6X4:2]  id: b1  length: 1.527940216866 angstrom  k: 419.9869268191 kilocalorie / angstrom ** 2 / mole  >,
 (1,
  3): <BondType with smirks: [#6X4:1]-[#6X4:2]  id: b1  length: 1.527940216866 angstrom  k: 419.9869268191 kilocalorie / angstrom ** 2 / mole  >,
 (1,
  4): <BondType with smirks: [#6X4:1]-[#6X3:2]  id: b2  length: 1.503434271105 angstrom  k: 484.1959214883 kilocalorie / angstrom ** 2 / mole  >,
 (2,
  33): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (2,
  34): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (2,
  35): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (3,
  36): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (3,
  37): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (3,
  38): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (4,
  5): <BondType with smirks: [#6X3:1]:[#6X3:2]  id: b5  length: 1.394445702699 angstrom  k: 765.1465671607 kilocalorie / angstrom ** 2 / mole  >,
 (4,
  11): <BondType with smirks: [#6X3:1]:[#6X3:2]  id: b5  length: 1.394445702699 angstrom  k: 765.1465671607 kilocalorie / angstrom ** 2 / mole  >,
 (5,
  6): <BondType with smirks: [#6X3:1]-[#8X2H1:2]  id: b18  length: 1.367997231102 angstrom  k: 673.9493155918 kilocalorie / angstrom ** 2 / mole  >,
 (5,
  7): <BondType with smirks: [#6X3:1]:[#6X3:2]  id: b5  length: 1.394445702699 angstrom  k: 765.1465671607 kilocalorie / angstrom ** 2 / mole  >,
 (6,
  39): <BondType with smirks: [#8:1]-[#1:2]  id: b88  length: 0.981124525388 angstrom  k: 1069.809209734 kilocalorie / angstrom ** 2 / mole  >,
 (7,
  8): <BondType with smirks: [#6X3:1]-[#8X2H1:2]  id: b18  length: 1.367997231102 angstrom  k: 673.9493155918 kilocalorie / angstrom ** 2 / mole  >,
 (7,
  9): <BondType with smirks: [#6X3:1]:[#6X3:2]  id: b5  length: 1.394445702699 angstrom  k: 765.1465671607 kilocalorie / angstrom ** 2 / mole  >,
 (8,
  40): <BondType with smirks: [#8:1]-[#1:2]  id: b88  length: 0.981124525388 angstrom  k: 1069.809209734 kilocalorie / angstrom ** 2 / mole  >,
 (9,
  10): <BondType with smirks: [#6X3:1]:[#6X3:2]  id: b5  length: 1.394445702699 angstrom  k: 765.1465671607 kilocalorie / angstrom ** 2 / mole  >,
 (9,
  29): <BondType with smirks: [#6X3:1]-[#7X3:2]  id: b8  length: 1.389681126838 angstrom  k: 640.6150893356 kilocalorie / angstrom ** 2 / mole  >,
 (10,
  11): <BondType with smirks: [#6X3:1]:[#6X3:2]  id: b5  length: 1.394445702699 angstrom  k: 765.1465671607 kilocalorie / angstrom ** 2 / mole  >,
 (10,
  13): <BondType with smirks: [#6X4:1]-[#6X3:2]  id: b2  length: 1.503434271105 angstrom  k: 484.1959214883 kilocalorie / angstrom ** 2 / mole  >,
 (11,
  12): <BondType with smirks: [#6X3:1]-[#8X2H1:2]  id: b18  length: 1.367997231102 angstrom  k: 673.9493155918 kilocalorie / angstrom ** 2 / mole  >,
 (12,
  41): <BondType with smirks: [#8:1]-[#1:2]  id: b88  length: 0.981124525388 angstrom  k: 1069.809209734 kilocalorie / angstrom ** 2 / mole  >,
 (13,
  14): <BondType with smirks: [#6X4:1]-[#8X2H0:2]  id: b16  length: 1.421832315661 angstrom  k: 434.3139352817 kilocalorie / angstrom ** 2 / mole  >,
 (13,
  18): <BondType with smirks: [#6X4:1]-[#6X4:2]  id: b1  length: 1.527940216866 angstrom  k: 419.9869268191 kilocalorie / angstrom ** 2 / mole  >,
 (13,
  42): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (14,
  15): <BondType with smirks: [#6X4:1]-[#8X2H0:2]  id: b16  length: 1.421832315661 angstrom  k: 434.3139352817 kilocalorie / angstrom ** 2 / mole  >,
 (15,
  16): <BondType with smirks: [#6X4:1]-[#6X4:2]  id: b1  length: 1.527940216866 angstrom  k: 419.9869268191 kilocalorie / angstrom ** 2 / mole  >,
 (15,
  43): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (15,
  44): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (16,
  17): <BondType with smirks: [#6X4:1]-[#6X4:2]  id: b1  length: 1.527940216866 angstrom  k: 419.9869268191 kilocalorie / angstrom ** 2 / mole  >,
 (16,
  45): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (16,
  46): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (17,
  18): <BondType with smirks: [#6X4:1]-[#6X4:2]  id: b1  length: 1.527940216866 angstrom  k: 419.9869268191 kilocalorie / angstrom ** 2 / mole  >,
 (17,
  47): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (17,
  48): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (18,
  19): <BondType with smirks: [#6X4:1]-[#6X4:2]  id: b1  length: 1.527940216866 angstrom  k: 419.9869268191 kilocalorie / angstrom ** 2 / mole  >,
 (18,
  49): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (19,
  20): <BondType with smirks: [#6X4:1]-[#6X3:2]  id: b2  length: 1.503434271105 angstrom  k: 484.1959214883 kilocalorie / angstrom ** 2 / mole  >,
 (19,
  29): <BondType with smirks: [#6:1]-[#7:2]  id: b7  length: 1.46420197713 angstrom  k: 457.1029448115 kilocalorie / angstrom ** 2 / mole  >,
 (19,
  50): <BondType with smirks: [#6X4:1]-[#1:2]  id: b84  length: 1.090139506109 angstrom  k: 719.6424928981 kilocalorie / angstrom ** 2 / mole  >,
 (20,
  21): <BondType with smirks: [#6X3:1]:[#6X3:2]  id: b5  length: 1.394445702699 angstrom  k: 765.1465671607 kilocalorie / angstrom ** 2 / mole  >,
 (20,
  28): <BondType with smirks: [#6X3:1]:[#6X3:2]  id: b5  length: 1.394445702699 angstrom  k: 765.1465671607 kilocalorie / angstrom ** 2 / mole  >,
 (21,
  22): <BondType with smirks: [#6X3:1]:[#6X3:2]  id: b5  length: 1.394445702699 angstrom  k: 765.1465671607 kilocalorie / angstrom ** 2 / mole  >,
 (21,
  51): <BondType with smirks: [#6X3:1]-[#1:2]  id: b85  length: 1.081823673944 angstrom  k: 775.3853383846 kilocalorie / angstrom ** 2 / mole  >,
 (22,
  23): <BondType with smirks: [#6X3:1]-[#8X2H1:2]  id: b18  length: 1.367997231102 angstrom  k: 673.9493155918 kilocalorie / angstrom ** 2 / mole  >,
 (22,
  24): <BondType with smirks: [#6X3:1]:[#6X3:2]  id: b5  length: 1.394445702699 angstrom  k: 765.1465671607 kilocalorie / angstrom ** 2 / mole  >,
 (23,
  52): <BondType with smirks: [#8:1]-[#1:2]  id: b88  length: 0.981124525388 angstrom  k: 1069.809209734 kilocalorie / angstrom ** 2 / mole  >,
 (24,
  25): <BondType with smirks: [#6X3:1]-[#8X2H1:2]  id: b18  length: 1.367997231102 angstrom  k: 673.9493155918 kilocalorie / angstrom ** 2 / mole  >,
 (24,
  26): <BondType with smirks: [#6X3:1]:[#6X3:2]  id: b5  length: 1.394445702699 angstrom  k: 765.1465671607 kilocalorie / angstrom ** 2 / mole  >,
 (25,
  53): <BondType with smirks: [#8:1]-[#1:2]  id: b88  length: 0.981124525388 angstrom  k: 1069.809209734 kilocalorie / angstrom ** 2 / mole  >,
 (26,
  27): <BondType with smirks: [#6:1]-[#9:2]  id: b68  length: 1.351036117403 angstrom  k: 710.1945186755 kilocalorie / angstrom ** 2 / mole  >,
 (26,
  28): <BondType with smirks: [#6X3:1]:[#6X3:2]  id: b5  length: 1.394445702699 angstrom  k: 765.1465671607 kilocalorie / angstrom ** 2 / mole  >,
 (28,
  54): <BondType with smirks: [#6X3:1]-[#1:2]  id: b85  length: 1.081823673944 angstrom  k: 775.3853383846 kilocalorie / angstrom ** 2 / mole  >,
 (29,
  55): <BondType with smirks: [#7:1]-[#1:2]  id: b87  length: 1.022553377106 angstrom  k: 964.6719203843 kilocalorie / angstrom ** 2 / mole  >}

Take a look at the first entry as an example. The dictionary is keyed by the atomic indices of the particles in the molecule that the parameter applies to, and the values are special types that specify the parameters from the force field. We’ll look in more detail at one of these parameters in a second.

A little goes a long way — Changing parameters

We already decided we wanted to adjust the torsion parameter to the hydroxyl group, so let’s do that. We know parameters are indexed by the atoms they’ve been applied to, so we can look at our labelled widget above and pull out exactly the parameter we care about:

fluorine = [atom for atom in ligand.atoms if atom.symbol == "F"][0]
carbon = [neighbor_atom for neighbor_atom in fluorine.bonded_atoms][0]
fluorine_carbon_bond_indices = (
    fluorine.molecule_atom_index,
    carbon.molecule_atom_index,
)
print("The fluorine-carbon bond is between atoms", fluorine_carbon_bond_indices)

ff_applied_parameters[0]["Bonds"][fluorine_carbon_bond_indices]
The fluorine-carbon bond is between atoms (27, 26)
<BondType with smirks: [#6:1]-[#9:2]  id: b68  length: 1.351036117403 angstrom  k: 710.1945186755 kilocalorie / angstrom ** 2 / mole  >
📗 We have to get the indices of these atoms out programmatically because our SMILES code doesn't specify atom indices, and so you might have different indices to us. Take a look at the widget above and try manually specifying atom indices to get the parameters of a bond!

Let’s dig into this type a bit more. It has a few attributes in its textual representation. The first of these is maybe the most important: the smirks attribute, which tells the Toolkit which atoms this parameter applies to. SMIRKS is a chemical pattern matching format; think of it as the result of a regular expression having a baby with a SMILES string. This one is very simple: it is just a Carbon atom ([#6:1], atomic number 6) singly bonded (-) to a Fluorine atom ([#9:2], atomic number 9). The numbers after the colons just label the atoms — this is a bond, so it has two atoms labelled. This is helpful for when we want to match against atoms that aren’t a part of the parameter; we just don’t label the additional atoms. We’ll use this trick later.

The other important attributes provide the actual parameterization, and so are different for different kinds of parameters. For proper torsions, this is the periodicity of the sinewave that describes the torsion, as well as its phase and amplitude (or force constant, \(k\)). These parameter values are similar to the equivalent description in most other force field formats.

ℹ️ The SMIRKS specification is available online:

https://www.daylight.com/dayhtml/doc/theory/theory.smirks.html

It is closely related to the SMARTS molecular pattern matching language, whose specification is probably more useful for working with the Toolkit and is also available online:

https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html

Modifying a parameter

Unfortunately we can’t just modify this parameter and see the results reflected in the parameterization. We need to get the appropriate parameter from the force field and modify it there.

fluorine_bond = force_field["Bonds"].parameters["[#6:1]-[#9:2]"]
fluorine_bond.length = 10 * unit.angstrom

Here, we’ve selected the proper torsion with the SMIRKS code we found earlier, and changed its force constant by an order of magnitude in the opposite direction! Let’s see what we’ve wrought:

minimize_and_visualize(ligand, force_field)
Initial energy is 111057.8 kJ/mol; Minimized energy is -296.5 kJ/mol

Turns out Pinocchio was a real molecule all along!

Parameters affecting multiple atom groups

Ok, that was fun, but it’s only one parameter; we could easily have made this change in the OpenMM System or a GROMACS ITP file or whatever. What’s the toolkit really giving us here?

Let’s mess with all the H-X-H angles. And let’s not get into SMIRKS this time, lets let the toolkit do the thinking:

ff_applied_parameters = force_field.label_molecules(ligand.to_topology())
for atoms, parameter in ff_applied_parameters[0]["Angles"].items():
    ele_1 = ligand.atoms[atoms[0]].symbol
    ele_2 = ligand.atoms[atoms[1]].symbol
    ele_3 = ligand.atoms[atoms[2]].symbol
    if ele_1 == "H" and ele_3 == "H":
        print(atoms, parameter)
(30, 0, 31) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 108.5839257083 degree  k: 75.08254435747 kilocalorie / mole / radian ** 2  id: a2  >
(30, 0, 32) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 108.5839257083 degree  k: 75.08254435747 kilocalorie / mole / radian ** 2  id: a2  >
(31, 0, 32) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 108.5839257083 degree  k: 75.08254435747 kilocalorie / mole / radian ** 2  id: a2  >
(33, 2, 34) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 108.5839257083 degree  k: 75.08254435747 kilocalorie / mole / radian ** 2  id: a2  >
(33, 2, 35) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 108.5839257083 degree  k: 75.08254435747 kilocalorie / mole / radian ** 2  id: a2  >
(34, 2, 35) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 108.5839257083 degree  k: 75.08254435747 kilocalorie / mole / radian ** 2  id: a2  >
(36, 3, 37) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 108.5839257083 degree  k: 75.08254435747 kilocalorie / mole / radian ** 2  id: a2  >
(36, 3, 38) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 108.5839257083 degree  k: 75.08254435747 kilocalorie / mole / radian ** 2  id: a2  >
(37, 3, 38) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 108.5839257083 degree  k: 75.08254435747 kilocalorie / mole / radian ** 2  id: a2  >
(43, 15, 44) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 108.5839257083 degree  k: 75.08254435747 kilocalorie / mole / radian ** 2  id: a2  >
(45, 16, 46) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 108.5839257083 degree  k: 75.08254435747 kilocalorie / mole / radian ** 2  id: a2  >
(47, 17, 48) <AngleType with smirks: [#1:1]-[#6X4:2]-[#1:3]  angle: 108.5839257083 degree  k: 75.08254435747 kilocalorie / mole / radian ** 2  id: a2  >

Wow, there’s just one smirks parameter describing all of the H-X-H angles!

ℹ️ It's worth mentioning here that a force field might include multiple parameters with SMIRKS that match a particular group of atoms. When this happens, only the last parameter specified is applied. This allows force field authors to define general parameters first, and then override them with more specific parameters.
📗 Convince yourself that this SMIRKS code matches all of the H-X-H angles in our molecule. What H-X-H angles would it not match?
hxh_angle = force_field["Angles"].parameters["[#1:1]-[#6X4:2]-[#1:3]"]
hxh_angle.angle = 50 * unit.degree

minimize_and_visualize(ligand, force_field)
Initial energy is 113014.4 kJ/mol; Minimized energy is -111.0 kJ/mol

That is not how a methyl group is supposed to look. Guess we did it right!

Introducing a new parameter

Now let’s get really crazy. I’ve always thought heteroatoms in rings looked a bit too comfortable, haven’t you?

Since we have two cyclic heteroatoms in our molecule with very different chemistries, there’ll probably be two seperate parameters we need to modify. We can just pass the SMIRKS that come up for C-X-C angles to code that modifies the parameters directly, no need to manually copy it over.

ff_applied_parameters = force_field.label_molecules(ligand.to_topology())
for atoms, parameter in ff_applied_parameters[0]["Angles"].items():
    ele_1 = ligand.atoms[atoms[0]].symbol
    ele_2 = ligand.atoms[atoms[1]].symbol
    ele_3 = ligand.atoms[atoms[2]].symbol
    if (ele_2 not in ["C", "H"]) and ele_1 == "C" and ele_3 == "C":
        print(atoms, parameter)

        smirks = parameter.smirks
        heteroatom_angle = force_field["Angles"].parameters[smirks]
        # dihedral angles are undefined when linear, so let's not set to 180
        heteroatom_angle.angle = 179 * unit.degree

        # We'll need this later
        if ele_2 == "N":
            cyclic_nitrogen_smirks = smirks

minimize_and_visualize(ligand, force_field)
(9, 29, 19) <AngleType with smirks: [*:1]~[#7X3$(*~[#6X3,#6X2,#7X2+0]):2]~[*:3]  angle: 119.1748379248 degree  k: 151.142556131 kilocalorie / mole / radian ** 2  id: a20  >
(13, 14, 15) <AngleType with smirks: [*:1]-[#8:2]-[*:3]  angle: 111.9874516071 degree  k: 237.851218935 kilocalorie / mole / radian ** 2  id: a28  >
Initial energy is 116251.4 kJ/mol; Minimized energy is 160.9 kJ/mol

Oops! We didn’t mean to make all the hydroxyl groups linear! One of our parameters must be applied to Oxygen both when its in a ring and in a hydroxyl group. Sure enough, the second SMIRKS code printed above ([*:1]-[#8:2]-[*:3]) matches any X-O-X angle! We’ll have to define a new, more specific SMIRKS entry so we only capture the appropriate oxygen.

The SMIRKS for our new parameter

Our modified force field is getting a bit messy, so let’s start again. We still want the nitrogen angle from before, so lets save that too

force_field = ForceField("openff-2.1.0.offxml")
cyclic_nitrogen_angle = force_field["Angles"].parameters[cyclic_nitrogen_smirks]
cyclic_nitrogen_angle.angle = 179 * unit.degree

Now we need to define a new SMIRKS entry, specific to oxygens in a ring. We can be as specific as we like, as our new parameter will be at the end of the force field and so will override any conflicting parameters. Let’s define our smirks to match any singly-bonded C-O-C angle in a six-membered pyran ring.

First, we’ll write down the SMILES for a pyran:

C1CCOCC1

The first carbon has a digit following it, C1, which allows us to close the ring later. Then we have two more carbons, CC, an oxygen, O, and the final two carbons, CC. Finally, we close the ring by repeating the same digit from before, 1. Lets check that this produces the chemical we expect before we move on:

Molecule.from_smiles("C1CCOCC1").visualize()
../../../../_images/5a3591a33c17cbec32b1889a969299944f79bd68aaa95e63fa7770ed08815950.svg

Perfect. We can ignore the hydrogens, as we haven’t specified any bond orders - our smirks will match the saturated structure or any of its derivatives that maintain the ring. Now all we have to do is label the atoms we want to extract the bond from! This way, all the atoms must be present for the SMIRKS to match, but only these three atoms define the angle for the parameter. This lets us be more specific than we could be if we could only describe the atoms in the angle itself. We do that the same way we did before, put them in square brackets and append a colon and a number:

cyclic_oxygen_smirks = "C1C[C:1]-[O:2]-[C:3]C1"
📗 Try replacing this SMIRKS code with one of your own. Hint: For this force field and ligand, there's a much simpler code that will uniquely specify the C-O-C bond!

Defining and registering the new parameter

Now, all that’s left is to define the angle parameter, add it to the force field, and see what we’ve wrought!

# Define the new angle parameter
angle_parameter = offtk_parameters.AngleHandler.AngleType(
    smirks=cyclic_oxygen_smirks,
    angle=179 * unit.degree,
    k=134.5019777341 * unit.kilocalorie / (unit.mole * unit.radians**2),
)

# Add the parameter to the force field
angles_handler = force_field["Angles"]
angles_handler.add_parameter(parameter=angle_parameter)

# Visualize the newly parameterized molecule
minimize_and_visualize(ligand, force_field)
Initial energy is 701.1 kJ/mol; Minimized energy is -88.8 kJ/mol

Perfect! The two cyclic heteroatoms are nearly linear while the hydroxyl group retains its characteristic angle.

Bonus: Adding a cosmetic attribute to the new parameter

In some cases, you might want to tag a particular attribute with extra data. This might be useful, for example, in communicating to a fitting tool which parameters should be fitted, or simply to add extra metadata… The toolkit allows for cosmetic attributes, which can be stored to individual parameters, will be included out when the ForceField object is written to disk, but will not otherwise affect the parameterization machinery. The API point for this is ParameterType.add_cosmetic_attribute.

Let’s use this to tag our modified and new parameters with a note about why we introduced it.

angles_handler.parameters[cyclic_nitrogen_smirks].add_cosmetic_attribute(
    attr_name="note",
    attr_value="funky pseudo-linear cyclic nitrogen",
)

angles_handler.parameters[cyclic_oxygen_smirks].add_cosmetic_attribute(
    attr_name="note",
    attr_value="funky pseudo-linear cyclic oxygen",
)

We we look at these parameter now, they include this attribute tacked on at the end:

angles_handler.parameters[cyclic_nitrogen_smirks]
<AngleType with smirks: [*:1]~[#7X3$(*~[#6X3,#6X2,#7X2+0]):2]~[*:3]  angle: 179 degree  k: 151.142556131 kilocalorie / mole / radian ** 2  id: a20  note: funky pseudo-linear cyclic nitrogen  >
angles_handler.parameters[cyclic_oxygen_smirks]
<AngleType with smirks: C1C[C:1]-[O:2]-[C:3]C1  angle: 179 degree  k: 134.5019777341 kilocalorie / mole / radian ** 2  note: funky pseudo-linear cyclic oxygen  >

Cometic attributes are included by default when writing to disk. (This can be turned off by flipping the argument discard_cosmetic_attributes to True.) Let’s write out our modified force field and search through the file for the note we added.

force_field.to_file("modified.offxml")
!grep funky modified.offxml
        <Angle smirks="[*:1]~[#7X3$(*~[#6X3,#6X2,#7X2+0]):2]~[*:3]" angle="179 * degree ** 1" k="151.142556131 * mole ** -1 * radian ** -2 * kilocalorie ** 1" id="a20" note="funky pseudo-linear cyclic nitrogen"></Angle>
        <Angle smirks="C1C[C:1]-[O:2]-[C:3]C1" angle="179 * degree ** 1" k="134.5019777341 * kilocalorie ** 1 * mole ** -1 * radian ** -2" note="funky pseudo-linear cyclic oxygen"></Angle>
/home/runner/micromamba/envs/openff-docs-examples/lib/python3.10/pty.py:89: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
  pid, fd = os.forkpty()