Mixture simulation with Sage: Ethanol and cyclohexane

This example shows how to use the Open Force Field Toolkit to create a parametrized System object that can be used to run a molecular dynamic simulation with OpenMM. If you want to run MD with a different engine, see the example in examples/conversion_amber_gromacs/.

Create an Interchange

We start by loading a PDB file containing one copy of ethanol and cyclohexane. Our goal is to create an OFF Topology object describing this system that we can parametrize with the SMIRNOFF-format “Sage” force field. The parametrized system is stored in an Interchange object, which can then produce input files for a number of simulation engines.

The two Molecule objects created from the SMILES strings can contain information such as partial charges and stereochemistry that is not included in an OpenMM topology. In this example, partial charges are not explicitly given, and ForceField will assign AM1/BCC charges as specified by the “Sage” force field.

from openff.toolkit.utils import get_data_file_path

We use the get_data_file_path utility function to easily access the data installed with the toolkit. Here you have the option to load example systems of increasing complexity. For speed, we recommend that you begin by loading a system with a single ethanol and a single cyclohexane.

# 1 molecule of ethanol and 1 of cyclohexane.
pdb_file_path = get_data_file_path("systems/test_systems/1_cyclohexane_1_ethanol.pdb")

# 40%-60% cyclohexane-ethanol mixture.
# pdb_file_path = get_data_file_path('systems/packmol_boxes/cyclohexane_ethanol_0.4_0.6.pdb')

PDB files are not a reliable source of bond orders, so the toolkit requires users to supply a more detailed description of the molecule and its connectivity. SMILES records, SDF, and MOL2 files are among the many ways to create a Molecule.

from openff.toolkit import Molecule

ethanol = Molecule.from_smiles("CCO")
cyclohexane = Molecule.from_smiles("C1CCCCC1")

Alternatively, if you have sdf files of the molecules, or if you have OpenEye installed and mol2 files available, you can get the same results as above by loading the detailed molecule information from the files.

from openff.toolkit.utils.toolkits import OpenEyeToolkitWrapper

if OpenEyeToolkitWrapper.is_available():
    ethanol = Molecule.from_file(get_data_file_path("molecules/ethanol.mol2"))
    cyclohexane = Molecule.from_file(get_data_file_path("molecules/cyclohexane.mol2"))

We now create the Open Force Field Toolkit Topology describing the system from an OpenMM Topology object. The OFF Topology include more information (supplied by the two Molecule objects) than the OpenMM Topology such as (optionally) partial charges and stereochemistry. In this example, partial charges are not explicitly given, and ForceField will assign AM1/BCC charges as specified by the “Sage” force field.

Note that the Open Force Field Toolkit produces deterministic charges that do not depend on the input conformation of parameterized molecules. See the FAQ for more information.

Note on partial charges: The full 1.0.0 release will implement support for the definition of semiempirical partial charge treatment directly into the SMIRNOFF force field file (for details, see https://openforcefield.github.io/standards/standards/smirnoff/#partial-charge-and-electrostatics-models). Moreover, it will be possible to import charges directly from sdf and mol2 files.
from openff.toolkit import ForceField, Topology

# Create the OpenFF Topology from an PDB file
topology = Topology.from_pdb(
    unique_molecules=[ethanol, cyclohexane],

# Load the OpenFF "Sage" force field.
forcefield = ForceField("openff-2.1.0.offxml")

# Parametrize the topology and create an Interchange object.
interchange = forcefield.create_interchange(topology)

Run a simulation

We can now use the Interchange object to prepare and run molecular dynamics simulations with OpenMM.

import openmm
from openmm import unit

# Propagate the System with Langevin dynamics.
time_step = 2 * unit.femtoseconds  # simulation timestep
temperature = 300 * unit.kelvin  # simulation temperature
friction = 1 / unit.picosecond  # collision rate
integrator = openmm.LangevinIntegrator(temperature, friction, time_step)

# Length of the simulation.
num_steps = 1000  # number of integration steps to run

# Logging options.
trj_freq = 100  # number of steps per written trajectory frame
data_freq = 100  # number of steps per written simulation statistics

# Set up an OpenMM simulation.
simulation = interchange.to_openmm_simulation(integrator)

# Randomize the velocities from a Boltzmann distribution at a given temperature.

# Configure the information in the output files.
pdb_reporter = openmm.app.PDBReporter("trajectory.pdb", trj_freq)
state_data_reporter = openmm.app.StateDataReporter(
import time

print("Starting simulation")
start = time.process_time()

# Run the simulation

end = time.process_time()
print("Elapsed time %.2f seconds" % (end - start))
Starting simulation
Elapsed time 2.02 seconds

If successful, the directory where your jupyter notebook is running should contain a trajectory.pdb file that you can visualize and a data.csv file including potential energy, density, and temperature of each frame.

!cat data.csv
#"Step","Potential Energy (kJ/mole)","Temperature (K)","Density (g/mL)"
/home/runner/micromamba/envs/openff-docs-examples/lib/python3.10/pty.py:89: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
  pid, fd = os.forkpty()