openff.toolkit.topology.Molecule

class openff.toolkit.topology.Molecule(*args, **kwargs)

Mutable chemical representation of a molecule, such as a small molecule or biopolymer.

Examples

Create a molecule from an sdf file

>>> from openff.toolkit.utils import get_data_file_path
>>> sdf_filepath = get_data_file_path('molecules/ethanol.sdf')
>>> molecule = Molecule(sdf_filepath)

Convert to OpenEye OEMol object

>>> oemol = molecule.to_openeye()

Create a molecule from an OpenEye molecule

>>> molecule = Molecule.from_openeye(oemol)

Convert to RDKit Mol object

>>> rdmol = molecule.to_rdkit()

Create a molecule from an RDKit molecule

>>> molecule = Molecule.from_rdkit(rdmol)

Create a molecule from IUPAC name (requires the OpenEye toolkit)

>>> molecule = Molecule.from_iupac('imatinib')

Create a molecule from SMILES

>>> molecule = Molecule.from_smiles('Cc1ccccc1')

Warning

This API is experimental and subject to change.

__init__(*args, **kwargs)

Create a new Molecule object

Parameters

other (optional, default=None) –

If specified, attempt to construct a copy of the molecule from the specified object. This can be any one of the following:

  • a Molecule object

  • a file that can be used to construct a Molecule object

  • an openeye.oechem.OEMol

  • an rdkit.Chem.rdchem.Mol

  • a serialized Molecule object

Examples

Create an empty molecule:

>>> empty_molecule = Molecule()

Create a molecule from a file that can be used to construct a molecule, using either a filename or file-like object:

>>> from openff.toolkit.utils import get_data_file_path
>>> sdf_filepath = get_data_file_path('molecules/ethanol.sdf')
>>> molecule = Molecule(sdf_filepath)
>>> molecule = Molecule(open(sdf_filepath, 'r'), file_format='sdf')
>>> import gzip
>>> mol2_gz_filepath = get_data_file_path('molecules/toluene.mol2.gz')
>>> molecule = Molecule(gzip.GzipFile(mol2_gz_filepath, 'r'), file_format='mol2')

Create a molecule from another molecule:

>>> molecule_copy = Molecule(molecule)

Convert to OpenEye OEMol object

>>> oemol = molecule.to_openeye()

Create a molecule from an OpenEye molecule:

>>> molecule = Molecule(oemol)

Convert to RDKit Mol object

>>> rdmol = molecule.to_rdkit()

Create a molecule from an RDKit molecule:

>>> molecule = Molecule(rdmol)

Convert the molecule into a dictionary and back again:

>>> serialized_molecule = molecule.to_dict()
>>> molecule_copy = Molecule(serialized_molecule)

Methods

__init__(*args, **kwargs)

Create a new Molecule object

add_atom(atomic_number, formal_charge, ...)

Add an atom to the molecule.

add_bond(atom1, atom2, bond_order, is_aromatic)

Add a bond between two specified atom indices

add_conformer(coordinates)

Add a conformation of the molecule

add_default_hierarchy_schemes([...])

Adds chain and residue hierarchy schemes.

add_hierarchy_scheme(uniqueness_criteria, ...)

Use the molecule's metadata to facilitate iteration over its atoms.

apply_elf_conformer_selection([percentage, ...])

Select a set of diverse conformers from the molecule's conformers with ELF.

are_isomorphic(mol1, mol2[, ...])

Determine if mol1 is isomorphic to mol2.

assign_fractional_bond_orders([...])

Update and store the list of bond orders for this molecule.

assign_partial_charges(partial_charge_method)

Calculate partial atomic charges and store them in the molecule.

atom(index)

Get the atom with the specified index.

atom_index(atom)

Returns the index of the given atom in this molecule

bond(index)

Get the bond with the specified index.

canonical_order_atoms([toolkit_registry])

Produce a copy of the molecule with the atoms reordered canonically.

chemical_environment_matches(query[, ...])

Find matches in the molecule for a SMARTS string or ChemicalEnvironment query

compute_partial_charges_am1bcc([...])

Deprecated since version 0.11.0.

delete_hierarchy_scheme(iter_name)

Remove an existing HierarchyScheme specified by its iterator name.

enumerate_protomers([max_states])

Enumerate the formal charges of a molecule to generate different protomers.

enumerate_stereoisomers([undefined_only, ...])

Enumerate the stereocenters and bonds of the current molecule.

enumerate_tautomers([max_states, ...])

Enumerate the possible tautomers of the current molecule

find_rotatable_bonds([...])

Find all bonds classed as rotatable ignoring any matched to the ignore_functional_groups list.

from_bson(serialized)

Instantiate an object from a BSON serialized representation.

from_dict(molecule_dict)

Create a new Molecule from a dictionary representation

from_file(file_path[, file_format, ...])

Create one or more molecules from a file

from_inchi(inchi[, allow_undefined_stereo, ...])

Construct a Molecule from an InChI representation

from_iupac(iupac_name[, toolkit_registry, ...])

Generate a molecule from IUPAC or common name

from_json(serialized)

Instantiate an object from a JSON serialized representation.

from_mapped_smiles(mapped_smiles[, ...])

Create a Molecule from a mapped SMILES made with cmiles.

from_messagepack(serialized)

Instantiate an object from a MessagePack serialized representation.

from_openeye(oemol[, allow_undefined_stereo])

Create a Molecule from an OpenEye molecule.

from_pdb(file_path[, toolkit_registry])

Deprecated since version 0.11.0.

from_pdb_and_smiles(file_path, smiles[, ...])

Create a Molecule from a pdb file and a SMILES string using RDKit.

from_pickle(serialized)

Instantiate an object from a pickle serialized representation.

from_polymer_pdb(file_path[, toolkit_registry])

Loads a polymer from a PDB file.

from_qcschema(qca_record[, client, ...])

Create a Molecule from a QCArchive molecule record or dataset entry based on attached cmiles information.

from_rdkit(rdmol[, allow_undefined_stereo, ...])

Create a Molecule from an RDKit molecule.

from_smiles(smiles[, ...])

Construct a Molecule from a SMILES representation

from_toml(serialized)

Instantiate an object from a TOML serialized representation.

from_topology(topology)

Return a Molecule representation of an OpenFF Topology containing a single Molecule object.

from_xml(serialized)

Instantiate an object from an XML serialized representation.

from_yaml(serialized)

Instantiate from a YAML serialized representation.

generate_conformers([toolkit_registry, ...])

Generate conformers for this molecule using an underlying toolkit.

generate_unique_atom_names()

Generate unique atom names using element name and number of times that element has occurred e.g.

get_bond_between(i, j)

Returns the bond between two atoms

is_isomorphic_with(other, **kwargs)

Check if the molecule is isomorphic with the other molecule which can be an openff.toolkit.topology.Molecule or nx.Graph().

nth_degree_neighbors(n_degrees)

Return canonicalized pairs of atoms whose shortest separation is exactly n bonds.

ordered_connection_table_hash()

Compute an ordered hash of the atoms and bonds in the molecule

particle(index)

DEPRECATED: Use Molecule.atom instead.

particle_index(particle)

DEPRECATED: Use Molecule.atom_index instead.

perceive_residues([substructure_file_path, ...])

Perceive a polymer's residues and permit iterating over them.

remap(mapping_dict[, current_to_new])

Remap all of the indexes in the molecule to match the given mapping dict

strip_atom_stereochemistry(smarts[, ...])

Delete stereochemistry information for certain atoms, if it is present.

to_bson()

Return a BSON serialized representation.

to_dict()

Return a dictionary representation of the molecule.

to_file(file_path, file_format[, ...])

Write the current molecule to a file or file-like object

to_hill_formula()

Generate the Hill formula of this molecule.

to_inchi([fixed_hydrogens, toolkit_registry])

Create an InChI string for the molecule using the requested toolkit backend.

to_inchikey([fixed_hydrogens, toolkit_registry])

Create an InChIKey for the molecule using the requested toolkit backend.

to_iupac([toolkit_registry])

Generate IUPAC name from Molecule

to_json([indent])

Return a JSON serialized representation.

to_messagepack()

Return a MessagePack representation.

to_networkx()

Generate a NetworkX undirected graph from the molecule.

to_openeye([toolkit_registry, aromaticity_model])

Create an OpenEye molecule

to_pickle()

Return a pickle serialized representation.

to_qcschema([multiplicity, conformer, extras])

Create a QCElemental Molecule.

to_rdkit([aromaticity_model, toolkit_registry])

Create an RDKit molecule

to_smiles([isomeric, explicit_hydrogens, ...])

Return a canonical isomeric SMILES representation of the current molecule.

to_toml()

Return a TOML serialized representation.

to_topology()

Return an OpenFF Topology representation containing one copy of this molecule

to_xml([indent])

Return an XML representation.

to_yaml()

Return a YAML serialized representation.

update_hierarchy_schemes([iter_names])

Infer a hierarchy from atom metadata according to the existing hierarchy schemes.

visualize([backend, width, height, ...])

Render a visualization of the molecule in Jupyter

Attributes

amber_impropers

Iterate over all impropers with trivalent centers, reporting the central atom first.

angles

Get an iterator over all i-j-k angles.

atoms

Iterate over all Atom objects in the molecule.

bonds

Iterate over all Bond objects in the molecule.

conformers

Returns the list of conformers for this molecule.

has_unique_atom_names

True if the molecule has unique atom names, False otherwise.

hierarchy_schemes

The hierarchy schemes available on the molecule.

hill_formula

Get the Hill formula of the molecule

impropers

Iterate over all improper torsions in the molecule.

n_angles

Number of angles in the molecule.

n_atoms

The number of Atom objects.

n_bonds

The number of Bond objects in the molecule.

n_conformers

The number of conformers for this molecule.

n_impropers

Number of possible improper torsions in the molecule.

n_particles

Use Molecule.n_atoms instead.

n_propers

Number of proper torsions in the molecule.

name

The name (or title) of the molecule

partial_charges

Returns the partial charges (if present) on the molecule.

particles

Use Molecule.atoms instead.

propers

Iterate over all proper torsions in the molecule

properties

The properties dictionary of the molecule

smirnoff_impropers

Iterate over all impropers with trivalent centers, reporting the central atom second.

torsions

Get an iterator over all i-j-k-l torsions.

total_charge

Return the total charge on the molecule

add_atom(atomic_number, formal_charge, is_aromatic, stereochemistry=None, name=None, metadata=None)

Add an atom to the molecule.

Parameters
  • atomic_number (int) – Atomic number of the atom

  • formal_charge (int) – Formal charge of the atom

  • is_aromatic (bool) – If True, atom is aromatic; if False, not aromatic

  • stereochemistry (str, optional, default=None) – Either 'R' or 'S' for specified stereochemistry, or None if stereochemistry is irrelevant

  • name (str, optional) – An optional name for the atom

  • metadata (dict[str: (int, str)], default=None) – An optional dictionary where keys are strings and values are strings or ints. This is intended to record atom-level information used to inform hierarchy definition and iteration, such as grouping atoms by residue and chain.

Returns

index (int) – The index of the atom in the molecule

Examples

Define a methane molecule

>>> molecule = Molecule()
>>> molecule.name = 'methane'
>>> C = molecule.add_atom(6, 0, False)
>>> H1 = molecule.add_atom(1, 0, False)
>>> H2 = molecule.add_atom(1, 0, False)
>>> H3 = molecule.add_atom(1, 0, False)
>>> H4 = molecule.add_atom(1, 0, False)
>>> bond_idx = molecule.add_bond(C, H1, 1, False)
>>> bond_idx = molecule.add_bond(C, H2, 1, False)
>>> bond_idx = molecule.add_bond(C, H3, 1, False)
>>> bond_idx = molecule.add_bond(C, H4, 1, False)

add_bond(atom1, atom2, bond_order, is_aromatic, stereochemistry=None, fractional_bond_order=None)

Add a bond between two specified atom indices

Parameters
  • atom1 (int or Atom) – Index of first atom

  • atom2 (int or Atom) – Index of second atom

  • bond_order (int) – Integral bond order of Kekulized form

  • is_aromatic (bool) – True if this bond is aromatic, False otherwise

  • stereochemistry (str, optional, default=None) – Either 'E' or 'Z' for specified stereochemistry, or None if stereochemistry is irrelevant

  • fractional_bond_order (float, optional, default=None) – The fractional (e.g. Wiberg) bond order

Returns

index (int) – Index of the bond in this molecule

add_conformer(coordinates)

Add a conformation of the molecule

Parameters

coordinates (unit-wrapped np.array with shape (n_atoms, 3) and dimension of distance) – Coordinates of the new conformer, with the first dimension of the array corresponding to the atom index in the molecule’s indexing system.

Returns

index (int) – The index of this conformer
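
The shape requirement on coordinates can be sketched in plain Python. This is only an illustration under stated assumptions: check_conformer_shape is a hypothetical helper, nested lists stand in for the unit-wrapped array, and units are left out entirely. It is not how the toolkit validates its input.

```python
from typing import Sequence


def check_conformer_shape(coordinates: Sequence[Sequence[float]], n_atoms: int) -> None:
    """Raise ValueError unless coordinates has shape (n_atoms, 3)."""
    if len(coordinates) != n_atoms:
        raise ValueError(f"expected {n_atoms} rows, got {len(coordinates)}")
    for atom_index, row in enumerate(coordinates):
        if len(row) != 3:
            raise ValueError(f"row {atom_index} has {len(row)} components, expected 3")


# Water: one (x, y, z) row per atom, in the molecule's atom-index order.
# The numbers are illustrative only.
water_coordinates = [
    [0.000, 0.000, 0.117],
    [0.000, 0.757, -0.471],
    [0.000, -0.757, -0.471],
]
check_conformer_shape(water_coordinates, n_atoms=3)
```

The first dimension follows the atom index, so row i must hold the position of atom i.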

visualize(backend='rdkit', width=None, height=None, show_all_hydrogens=True)

Render a visualization of the molecule in Jupyter

Parameters
  • backend (str, optional, default='rdkit') –

    The visualization engine to use. Choose from:

    • "rdkit"

    • "openeye"

    • "nglview" (requires conformers)

  • width (int, optional, default=500) – Width of the generated representation (only applicable to backend=openeye or backend=rdkit)

  • height (int, optional, default=300) – Height of the generated representation (only applicable to backend=openeye or backend=rdkit)

  • show_all_hydrogens (bool, optional, default=True) – Whether to explicitly depict all hydrogen atoms. (only applicable to backend=openeye or backend=rdkit)

Returns

object – Depending on the backend chosen:

  • rdkit → IPython.display.SVG

  • openeye → IPython.display.Image

  • nglview → nglview.NGLWidget

perceive_residues(substructure_file_path=None, strict_chirality=True)

Perceive a polymer’s residues and permit iterating over them.

Perceives residues by matching substructures in the current molecule with a substructure dictionary file, using SMARTS, and assigns residue names and numbers to atom metadata. It then constructs a residue hierarchy scheme to allow iterating over residues.

Parameters
  • substructure_file_path (str, optional, default=None) – Path to substructure library file in JSON format. Defaults to using built-in substructure file.

  • strict_chirality (bool, optional, default=True) – Whether to use strict chirality symbols (stereomarks) for substructure matchings with SMARTS.

add_default_hierarchy_schemes(overwrite_existing=True)

Adds chain and residue hierarchy schemes.

The Open Force Field Toolkit has no native understanding of hierarchical atom organisation schemes common to other biomolecular software, such as “residues” or “chains” (see Hierarchy data (chains and residues)). Hierarchy schemes allow iteration over groups of atoms according to their metadata. For more information, see HierarchyScheme.

If a Molecule with the default hierarchy schemes changes, Molecule.update_hierarchy_schemes() must be called before the residues or chains are iterated over again or else the iteration may be incorrect.

Parameters

overwrite_existing (bool, default=True) – Whether to overwrite existing instances of the residue and chain hierarchy schemes. If this is False and either of the hierarchy schemes is already defined on this molecule, an exception will be raised.

Raises

HierarchySchemeWithIteratorNameAlreadyRegisteredException – When overwrite_existing=False and either the chains or residues hierarchy scheme is already configured.

add_hierarchy_scheme(uniqueness_criteria, iterator_name)

Use the molecule’s metadata to facilitate iteration over its atoms.

This method will add an attribute with the name given by the iterator_name argument that provides an iterator over groups of atoms. Atoms are grouped by the values in their atom.metadata dictionary; any atoms with the same values for the keys given in the uniqueness_criteria argument will be in the same group. These groups have the type HierarchyElement.

Hierarchy schemes are not updated dynamically; if a Molecule with hierarchy schemes changes, Molecule.update_hierarchy_schemes() must be called before the scheme is iterated over again or else the grouping may be incorrect.

Hierarchy schemes allow iteration over groups of atoms according to their metadata. For more information, see HierarchyScheme.

Parameters
  • uniqueness_criteria (tuple of str) – The names of Atom metadata entries that define this scheme. An atom belongs to a HierarchyElement only if its metadata has the same values for these criteria as the other atoms in the HierarchyElement.

  • iterator_name (str) – Name of the iterator that will be exposed to access the hierarchy elements generated by this scheme.

Returns

new_hier_scheme (openff.toolkit.topology.HierarchyScheme) – The newly created HierarchyScheme
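
The grouping rule described above can be sketched without the toolkit. In this minimal illustration, atom_metadata is a hypothetical stand-in for each atom's metadata dictionary, the keys chain_id, residue_number and residue_name are assumed for the example, and group_atoms is not part of the API.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical per-atom metadata; the list position is the atom index.
atom_metadata = [
    {"residue_name": "ALA", "residue_number": 1, "chain_id": "A"},
    {"residue_name": "ALA", "residue_number": 1, "chain_id": "A"},
    {"residue_name": "GLY", "residue_number": 2, "chain_id": "A"},
    {"residue_name": "GLY", "residue_number": 1, "chain_id": "B"},
]


def group_atoms(
    metadata: List[dict], uniqueness_criteria: Tuple[str, ...]
) -> Dict[tuple, List[int]]:
    """Group atom indices by their metadata values for the given keys."""
    groups = defaultdict(list)
    for atom_index, entry in enumerate(metadata):
        key = tuple(entry[name] for name in uniqueness_criteria)
        groups[key].append(atom_index)
    return dict(groups)


residues = group_atoms(atom_metadata, ("chain_id", "residue_number", "residue_name"))
chains = group_atoms(atom_metadata, ("chain_id",))

# The grouping is a snapshot. Editing metadata afterwards leaves it stale
# until the grouping is recomputed.
atom_metadata[3]["chain_id"] = "A"
refreshed_chains = group_atoms(atom_metadata, ("chain_id",))
```

Here chains still reports a separate chain "B" after the edit, while refreshed_chains does not. This mirrors why Molecule.update_hierarchy_schemes() has to be called after a molecule changes.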

property amber_impropers: Set[Tuple[Atom, Atom, Atom, Atom]]

Iterate over all impropers with trivalent centers, reporting the central atom first.

The central atom is reported first in each torsion. This method reports an improper for each trivalent atom in the molecule, whether or not any given force field would assign it improper torsion parameters.

Also note that this will return 6 possible atom orderings around each improper center. In current AMBER parameterization, one of these six orderings will be used for the actual assignment of the improper term and measurement of the angle. This method does not encode the logic to determine which of the six orderings AMBER would use.

Returns

impropers (set of tuple) – An iterator of tuples, each containing the indices of atoms making up a possible improper torsion. The central atom is listed first in each tuple.
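
To see where the six orderings come from, the enumeration can be sketched with a plain adjacency map. This is an illustration only: neighbors and central_first_impropers are hypothetical names, and atoms are represented by integer indices.

```python
from itertools import permutations
from typing import Dict, List, Set, Tuple

# Hypothetical connectivity for formaldehyde: atom 0 is the carbon,
# bonded to atoms 1, 2 and 3.
neighbors: Dict[int, List[int]] = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}


def central_first_impropers(
    neighbors: Dict[int, List[int]]
) -> Set[Tuple[int, int, int, int]]:
    """Return every (center, a, b, c) ordering around each trivalent atom."""
    impropers = set()
    for center, bonded in neighbors.items():
        if len(bonded) == 3:  # trivalent center
            for ordering in permutations(bonded):
                impropers.add((center, *ordering))
    return impropers


impropers = central_first_impropers(neighbors)
```

Three neighbors have 3! = 6 permutations, so a molecule with one trivalent center yields six tuples, all starting with the central atom.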

property angles: Set[Tuple[Atom, Atom, Atom]]

Get an iterator over all i-j-k angles.

apply_elf_conformer_selection(percentage: float = 2.0, limit: int = 10, toolkit_registry: Optional[Union[ToolkitRegistry, ToolkitWrapper]] = GLOBAL_TOOLKIT_REGISTRY, **kwargs)

Select a set of diverse conformers from the molecule’s conformers with ELF.

Applies the Electrostatically Least-interacting Functional groups method to select a set of diverse conformers which have minimal electrostatically strongly interacting functional groups from the molecule’s conformers.

Parameters
  • toolkit_registry – The underlying toolkit to use to select the ELF conformers.

  • percentage – The percentage of conformers with the lowest electrostatic interaction energies to greedily select from.

  • limit – The maximum number of conformers to select.

Notes

  • The input molecule should have a large set of conformers already generated to select the ELF conformers from.

  • The selected conformers will be retained in the conformers list while unselected conformers will be discarded.
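
The roles of percentage and limit can be sketched as a two-stage selection. The sketch below makes loud assumptions: energies and distances are precomputed hypothetical inputs, and the diversity step uses a greedy farthest-point rule, which is one common choice and not necessarily the rule the underlying toolkit applies.

```python
import math
from typing import List, Sequence


def select_diverse_low_energy(
    energies: Sequence[float],
    distances: Sequence[Sequence[float]],
    percentage: float = 2.0,
    limit: int = 10,
) -> List[int]:
    """Return indices of the selected conformers."""
    if not energies or limit < 1:
        return []
    # Stage 1: keep the lowest-energy `percentage` of conformers (at least one).
    n_keep = max(1, math.ceil(len(energies) * percentage / 100.0))
    pool = sorted(range(len(energies)), key=lambda i: energies[i])[:n_keep]
    # Stage 2: start from the lowest-energy conformer, then repeatedly add
    # the pool member farthest from everything selected so far (assumed rule).
    selected = [pool[0]]
    remaining = pool[1:]
    while remaining and len(selected) < limit:
        best = max(remaining, key=lambda i: min(distances[i][j] for j in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


energies = [5.0, 1.0, 3.0, 2.0, 4.0]
distances = [
    [0.0, 1.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 2.0, 0.1, 1.0],
    [1.0, 2.0, 0.0, 1.9, 1.0],
    [1.0, 0.1, 1.9, 0.0, 1.0],
    [1.0, 1.0, 1.0, 1.0, 0.0],
]
picked = select_diverse_low_energy(energies, distances, percentage=60.0, limit=2)
```

With these inputs the pool is conformers 1, 3 and 2 (the three lowest energies). Conformer 3 is nearly identical to conformer 1, so the second pick is conformer 2.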

static are_isomorphic(mol1, mol2, return_atom_map=False, aromatic_matching=True, formal_charge_matching=True, bond_order_matching=True, atom_stereochemistry_matching=True, bond_stereochemistry_matching=True, strip_pyrimidal_n_atom_stereo=True, toolkit_registry=GLOBAL_TOOLKIT_REGISTRY)

Determine if mol1 is isomorphic to mol2.

are_isomorphic() compares two molecules' graph representations and the chosen node/edge attributes. Connections and atomic numbers are always checked.

If nx.Graphs() are given they must at least have atomic_number attributes on nodes. Other attributes that are_isomorphic() can optionally check…

  • … in nodes are:

    • is_aromatic

    • formal_charge

    • stereochemistry

  • … in edges are:

    • is_aromatic

    • bond_order

    • stereochemistry

By default, all attributes are checked, but stereochemistry around pyramidal nitrogen is ignored.

Warning

This API is experimental and subject to change.

Parameters
  • mol1 (an openff.toolkit.topology.molecule.FrozenMolecule or nx.Graph()) – The first molecule to test for isomorphism.

  • mol2 (an openff.toolkit.topology.molecule.FrozenMolecule or nx.Graph()) – The second molecule to test for isomorphism.

  • return_atom_map (bool, default=False, optional) – Return a dict containing the atomic mapping instead of a bool.

  • aromatic_matching (bool, default=True, optional) – If False, aromaticity of graph nodes and edges is ignored for the purpose of determining isomorphism.

  • formal_charge_matching (bool, default=True, optional) – If False, formal charges of graph nodes are ignored for the purpose of determining isomorphism.

  • bond_order_matching (bool, default=True, optional) – If False, bond orders of graph edges are ignored for the purpose of determining isomorphism.

  • atom_stereochemistry_matching (bool, default=True, optional) – If False, atoms’ stereochemistry is ignored for the purpose of determining isomorphism.

  • bond_stereochemistry_matching (bool, default=True, optional) – If False, bonds’ stereochemistry is ignored for the purpose of determining isomorphism.

  • strip_pyrimidal_n_atom_stereo (bool, default=True, optional) – If True, any stereochemistry defined around pyramidal nitrogen stereocenters will be disregarded in the isomorphism check.

  • toolkit_registry (openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for removing stereochemistry from pyramidal nitrogens.

Returns

  • molecules_are_isomorphic (bool)

  • atom_map (Dict[int, int], optional, default=None) – Atom mapping ordered by mol1 indexing, {mol1_index: mol2_index}. If the molecules are not isomorphic given the input arguments, None is returned instead of a dict.
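
The meaning of the returned atom map can be illustrated with a brute-force check that compares only the two attributes that are always checked, connections and atomic numbers. This sketch is not the toolkit's algorithm: find_atom_map is a hypothetical helper, it tries every permutation (so it only suits very small molecules), and it assumes each bond is listed once.

```python
from itertools import permutations
from typing import Dict, List, Optional, Tuple

Bond = Tuple[int, int]


def find_atom_map(
    atomic_numbers1: List[int],
    bonds1: List[Bond],
    atomic_numbers2: List[int],
    bonds2: List[Bond],
) -> Optional[Dict[int, int]]:
    """Return {mol1_index: mol2_index}, or None if the graphs do not match."""
    n = len(atomic_numbers1)
    if n != len(atomic_numbers2) or len(bonds1) != len(bonds2):
        return None
    edges2 = {frozenset(bond) for bond in bonds2}
    for candidate in permutations(range(n)):
        # candidate[i] is the mol2 index proposed for mol1 atom i.
        if any(atomic_numbers1[i] != atomic_numbers2[candidate[i]] for i in range(n)):
            continue
        if all(frozenset((candidate[a], candidate[b])) in edges2 for a, b in bonds1):
            return {i: candidate[i] for i in range(n)}
    return None


# Water with two different atom orderings: O-first versus O in the middle.
atom_map = find_atom_map([8, 1, 1], [(0, 1), (0, 2)], [1, 8, 1], [(0, 1), (1, 2)])
# Same atoms, different connectivity (a hypothetical O-H-H chain): no match.
no_map = find_atom_map([8, 1, 1], [(0, 1), (0, 2)], [8, 1, 1], [(0, 1), (1, 2)])
```

In the first call the oxygen (index 0 in mol1) maps to index 1 in mol2, so the keys follow mol1 indexing as described above.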

assign_fractional_bond_orders(bond_order_model=None, toolkit_registry=GLOBAL_TOOLKIT_REGISTRY, use_conformers=None)

Update and store the list of bond orders for this molecule.

Bond orders are stored on each bond, in the bond.fractional_bond_order attribute.

Warning

This API is experimental and subject to change.

Parameters
  • toolkit_registry (openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for the fractional bond order calculation.

  • bond_order_model (str, optional, default=None) – The bond order model to use for fractional bond order calculation. If None, "am1-wiberg" is used.

  • use_conformers (iterable of openmm.unit.Quantity(np.array) with shape (n_atoms, 3) and dimension of distance, optional, default=None) – The conformers to use for fractional bond order calculation. If None, an appropriate number of conformers will be generated by an available ToolkitWrapper.

Examples

>>> molecule = Molecule.from_smiles('CCCCCC')
>>> molecule.assign_fractional_bond_orders()

Raises

InvalidToolkitRegistryError – If an invalid object is passed as the toolkit_registry parameter

assign_partial_charges(partial_charge_method: str, strict_n_conformers=False, use_conformers=None, toolkit_registry=GLOBAL_TOOLKIT_REGISTRY, normalize_partial_charges=True)

Calculate partial atomic charges and store them in the molecule.

assign_partial_charges computes charges using the specified toolkit and assigns the new values to the partial_charges attribute. Supported charge methods vary from toolkit to toolkit, but some supported methods are:

  • "am1bcc"

  • "am1bccelf10" (requires OpenEye Toolkits)

  • "am1-mulliken"

  • "mmff94"

  • "gasteiger"

For more supported charge methods and details, see the corresponding methods in each toolkit wrapper.

Parameters
  • partial_charge_method (string) – The method to use for partial charge calculation.

  • strict_n_conformers (bool, default=False) – Whether to raise an exception if an invalid number of conformers is provided for the given charge method. If this is False and an invalid number of conformers is found, a warning will be raised.

  • use_conformers (iterable of openmm.unit.Quantity-wrapped numpy arrays, each with shape (n_atoms, 3) and dimension of distance, optional, default=None) – Coordinates to use for partial charge calculation. If None, an appropriate number of conformers will be generated.

  • toolkit_registry (openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for the calculation.

  • normalize_partial_charges (bool, default=True) – Whether to offset partial charges so that they sum to the total formal charge of the molecule. This is used to prevent accumulation of rounding errors when the partial charge assignment method returns values at limited precision.

Examples

>>> molecule = Molecule.from_smiles('CCCCCC')
>>> molecule.assign_partial_charges('am1-mulliken')

Raises

InvalidToolkitRegistryError – If an invalid object is passed as the toolkit_registry parameter
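
The effect of normalize_partial_charges can be sketched with plain floats. The sketch assumes the correction is spread evenly over all atoms; the source only says the charges are offset so that they sum to the total formal charge, so the even split and the helper name are assumptions.

```python
from typing import List


def normalize_partial_charges(
    charges: List[float], total_formal_charge: float
) -> List[float]:
    """Shift every charge by the same amount so the sum matches the target."""
    offset = (total_formal_charge - sum(charges)) / len(charges)
    return [charge + offset for charge in charges]


# Hypothetical charges reported at three decimals; they sum to -0.001
# instead of the formal charge of 0.
rounded = [-0.412, 0.103, 0.103, 0.103, 0.102]
normalized = normalize_partial_charges(rounded, total_formal_charge=0)
```

Because every atom receives the same shift, differences between atomic charges are preserved while the rounding residue is removed.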

atom(index: int) → Atom

Get the atom with the specified index.

Parameters

index (int) –

Returns

atom (openff.toolkit.topology.Atom)

atom_index(atom: Atom) → int

Returns the index of the given atom in this molecule

Parameters

atom (Atom) –

Returns

index (int) – The index of the given atom in this molecule

property atoms

Iterate over all Atom objects in the molecule.

bond(index: int) → Bond

Get the bond with the specified index.

Parameters

index (int) –

Returns

bond (openff.toolkit.topology.Bond)

property bonds: List[Bond]

Iterate over all Bond objects in the molecule.

canonical_order_atoms(toolkit_registry=GLOBAL_TOOLKIT_REGISTRY)

Produce a copy of the molecule with the atoms reordered canonically.

Each toolkit defines its own canonical ordering of atoms. The canonical order may change from toolkit version to toolkit version or between toolkits.

Warning

This API is experimental and subject to change.

Parameters

toolkit_registry (openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for determining the canonical atom order

Returns

molecule (openff.toolkit.topology.Molecule) – A new OpenFF-style molecule with atoms in the canonical order.
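
Producing a reordered copy comes down to index bookkeeping: atoms move to new positions and every bond has to follow them. The sketch below shows only that bookkeeping. reorder_atoms is a hypothetical helper, and the ordering used in the example (heaviest element first) is made up for illustration; each toolkit defines its own canonical order.

```python
from typing import Dict, List, Tuple

Bond = Tuple[int, int]


def reorder_atoms(
    elements: List[str], bonds: List[Bond], new_order: List[int]
) -> Tuple[List[str], List[Bond]]:
    """Return reordered copies; new_order[k] is the old index placed at position k."""
    old_to_new: Dict[int, int] = {old: new for new, old in enumerate(new_order)}
    new_elements = [elements[old] for old in new_order]
    new_bonds = sorted(
        tuple(sorted((old_to_new[a], old_to_new[b]))) for a, b in bonds
    )
    return new_elements, new_bonds


# Hydrogen cyanide stored as H, C, N with bonds H-C and C-N.
elements = ["H", "C", "N"]
bonds = [(0, 1), (1, 2)]
# Made-up ordering rule for the example: N first, then C, then H.
new_elements, new_bonds = reorder_atoms(elements, bonds, new_order=[2, 1, 0])
```

The inputs are left untouched, matching the documented behavior of returning a copy of the molecule.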

chemical_environment_matches(query, unique=False, toolkit_registry=GLOBAL_TOOLKIT_REGISTRY)

Find matches in the molecule for a SMARTS string or ChemicalEnvironment query

Parameters
  • query (str or ChemicalEnvironment) – SMARTS string (with one or more tagged atoms) or ChemicalEnvironment query. Query will internally be resolved to SMIRKS using query.asSMIRKS() if it has an .asSMIRKS method.

  • toolkit_registry (openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for chemical environment matches

Returns

matches (list of atom index tuples) – A list of tuples, containing the indices of the matching atoms.

Examples

Retrieve all the carbon-carbon bond matches in a molecule

>>> molecule = Molecule.from_iupac('imatinib')
>>> matches = molecule.chemical_environment_matches('[#6X3:1]~[#6X3:2]')

compute_partial_charges_am1bcc(use_conformers=None, strict_n_conformers=False, toolkit_registry=GLOBAL_TOOLKIT_REGISTRY)

Deprecated since version 0.11.0: This method was deprecated in v0.11.0 and will soon be removed. Use assign_partial_charges(partial_charge_method='am1bcc') instead.

Calculate partial atomic charges for this molecule using AM1-BCC run by an underlying toolkit and assign them to this molecule’s partial_charges attribute.

Parameters
  • strict_n_conformers (bool, default=False) – Whether to raise an exception if an invalid number of conformers is provided for the given charge method. If this is False and an invalid number of conformers is found, a warning will be raised.

  • use_conformers (iterable of openmm.unit.Quantity-wrapped numpy arrays, each with shape (n_atoms, 3) and dimension of distance, optional, default=None) – Coordinates to use for partial charge calculation. If None, an appropriate number of conformers for the given charge method will be generated.

  • toolkit_registry (openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for the calculation

Examples

>>> molecule = Molecule.from_smiles('CCCCCC')
>>> molecule.generate_conformers()
>>> molecule.compute_partial_charges_am1bcc()

Raises

InvalidToolkitRegistryError – If an invalid object is passed as the toolkit_registry parameter

property conformers

Returns the list of conformers for this molecule.

Conformers are presented as a list of Quantity-wrapped NumPy arrays, each of shape (n_atoms, 3) and with dimensions of [Distance]. The return value is the actual list of conformers, and changes to the contents affect the original FrozenMolecule.
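
The aliasing described above has a practical consequence that a small stand-in class can show. TinyMolecule is hypothetical and mimics only the fact that the property hands back the stored list and not a copy; nested lists replace the Quantity-wrapped arrays.

```python
import copy


class TinyMolecule:
    """Hypothetical stand-in that mimics only the aliasing behavior."""

    def __init__(self) -> None:
        self._conformers = [[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]]

    @property
    def conformers(self):
        return self._conformers  # the stored list itself, not a copy


mol = TinyMolecule()

# Editing through the returned list changes the molecule's own data.
mol.conformers[0][1][0] = 1.5

# Work on a deep copy when the molecule must stay unchanged.
snapshot = copy.deepcopy(mol.conformers)
snapshot[0][1][0] = 9.9
```

After these lines the molecule holds 1.5 for that coordinate, and the later edit to snapshot has no effect on it.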

delete_hierarchy_scheme(iter_name)

Remove an existing HierarchyScheme specified by its iterator name.

Hierarchy schemes allow iteration over groups of atoms according to their metadata. For more information, see HierarchyScheme.

Parameters

iter_name (str) –

enumerate_protomers(max_states=10)

Enumerate the formal charges of a molecule to generate different protomers.

Parameters

max_states (int, optional, default=10) – The maximum number of protomer states to be returned.

Returns

molecules (List[openff.toolkit.topology.Molecule]) – A list of the protomers of the input molecule, not including the input.

enumerate_stereoisomers(undefined_only=False, max_isomers=20, rationalise=True, toolkit_registry=GLOBAL_TOOLKIT_REGISTRY)

Enumerate the stereocenters and bonds of the current molecule.

Parameters
  • undefined_only (bool, optional, default=False) – If True, enumerate only the stereocenters and bonds with undefined stereochemistry; if False, enumerate all of them.

  • max_isomers (int, optional, default=20) – The maximum number of molecules that should be returned.

  • rationalise (bool, optional, default=True) – Whether to try to build and rationalise the molecule to ensure it can exist.

  • toolkit_registry (openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use to enumerate the stereoisomers.

Returns

molecules (List[openff.toolkit.topology.Molecule]) – A list of Molecule instances not including the input molecule.
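
How undefined_only and max_isomers shape the result can be sketched as a combinatorial enumeration over stereo labels. This is a simplified illustration: enumerate_assignments is a hypothetical helper, only atom centers with 'R'/'S' labels are considered (bond stereochemistry is left out), and nothing corresponds to the rationalise step.

```python
from itertools import product
from typing import Dict, List, Optional

StereoLabels = Dict[int, Optional[str]]


def enumerate_assignments(
    centers: StereoLabels, undefined_only: bool = False, max_isomers: int = 20
) -> List[StereoLabels]:
    """Return up to max_isomers R/S assignments that differ from the input."""
    # Atom indices whose label is allowed to change.
    variable = [
        index
        for index, label in centers.items()
        if label is None or not undefined_only
    ]
    results: List[StereoLabels] = []
    for labels in product("RS", repeat=len(variable)):
        if len(results) >= max_isomers:
            break
        candidate = dict(centers)
        candidate.update(zip(variable, labels))
        if candidate != centers:  # the input itself is never returned
            results.append(candidate)
    return results


# Atom 2 is defined as R, atom 5 has no stereochemistry assigned.
partly_defined = {2: "R", 5: None}
all_centers = enumerate_assignments(partly_defined)
only_undefined = enumerate_assignments(partly_defined, undefined_only=True)
```

With undefined_only=True the defined center at atom 2 stays R and only atom 5 varies, which halves the number of results in this example.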

enumerate_tautomers(max_states=20, toolkit_registry=GLOBAL_TOOLKIT_REGISTRY)

Enumerate the possible tautomers of the current molecule

Parameters
  • max_states (int, optional, default=20) – The maximum number of molecules that should be returned

  • toolkit_registry (openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use to enumerate the tautomers.

Returns

molecules (List[openff.toolkit.topology.Molecule]) – A list of openff.toolkit.topology.Molecule instances not including the input molecule.

find_rotatable_bonds(ignore_functional_groups=None, toolkit_registry=GLOBAL_TOOLKIT_REGISTRY)

Find all bonds classed as rotatable ignoring any matched to the ignore_functional_groups list.

Parameters
  • ignore_functional_groups (List[str], optional, default=None) – A list of bond SMARTS patterns to be ignored when finding rotatable bonds.

  • toolkit_registry (openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for SMARTS matching

Returns

bonds (List[openff.toolkit.topology.molecule.Bond]) – The list of openff.toolkit.topology.molecule.Bond instances which are rotatable.

classmethod from_bson(serialized)

Instantiate an object from a BSON serialized representation.

Specification: http://bsonspec.org/

Parameters

serialized (bytes) – A BSON serialized representation of the object

Returns

instance (cls) – An instantiated object

classmethod from_dict(molecule_dict)

Create a new Molecule from a dictionary representation

Parameters

molecule_dict (OrderedDict) – A dictionary representation of the molecule.

Returns

molecule (Molecule) – A Molecule created from the dictionary representation

classmethod from_file(file_path, file_format=None, toolkit_registry=<ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit>, allow_undefined_stereo=False)

Create one or more molecules from a file

Parameters
  • file_path (str or file-like object) – The path to the file or file-like object to stream one or more molecules from.

  • file_format (str, optional, default=None) – Format specifier, usually the file suffix (e.g. ‘MOL2’, ‘SMI’). Note that not all toolkits support all formats. Check ToolkitWrapper.toolkit_file_read_formats for your loaded toolkits for details.

  • toolkit_registry (openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for file loading. If a Toolkit is passed, only the highest-precedence toolkit is used

  • allow_undefined_stereo (bool, default=False) – If False, raises an exception if a molecule in the file contains undefined stereochemistry.

Returns

molecules (Molecule or list of Molecules) – If there is a single molecule in the file, a Molecule is returned; otherwise, a list of Molecule objects is returned.
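
Because the return type depends on how many molecules the file holds, calling code often normalises the result before iterating. The sketch below shows one way to do that; `as_molecule_list` is a hypothetical helper (not part of the toolkit), and plain strings stand in for Molecule objects so the snippet runs without any toolkit installed.

```python
from typing import Any, List


def as_molecule_list(result: Any) -> List[Any]:
    """Return `result` unchanged if it is already a list, else wrap it in one.

    Hypothetical helper for illustration; not part of the toolkit.
    """
    return result if isinstance(result, list) else [result]


# Stand-in values: strings play the role of Molecule objects here.
single = "ethanol"                # what a single-molecule file would yield
several = ["ethanol", "toluene"]  # what a multi-molecule file would yield

for molecule in as_molecule_list(single):
    print(molecule)
```

In real use, the same wrapper could be applied to the value returned by Molecule.from_file so that downstream loops do not need to branch on the type.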

Examples

>>> from openff.toolkit.tests.utils import get_monomer_mol2_file_path
>>> mol2_file_path = get_monomer_mol2_file_path('cyclohexane')
>>> molecule = Molecule.from_file(mol2_file_path)

classmethod from_inchi(inchi, allow_undefined_stereo=False, toolkit_registry=<ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit>)

Construct a Molecule from an InChI representation

Parameters
  • inchi (str) – The InChI representation of the molecule.

  • allow_undefined_stereo (bool, default=False) – Whether to accept InChI with undefined stereochemistry. If False, an exception will be raised if an InChI with undefined stereochemistry is passed into this function.

  • toolkit_registry (openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for InChI-to-molecule conversion

Returns

molecule (openff.toolkit.topology.Molecule)

Examples

Make cis-1,2-Dichloroethene:

>>> molecule = Molecule.from_inchi('InChI=1S/C2H2Cl2/c3-1-2-4/h1-2H/b2-1-')

classmethod from_iupac(iupac_name, toolkit_registry=<ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit>, allow_undefined_stereo=False, **kwargs)

Generate a molecule from IUPAC or common name

Note

This method requires the OpenEye toolkit to be installed.

Parameters
  • iupac_name (str) – IUPAC name of molecule to be generated

  • toolkit_registry (ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for name-to-molecule conversion

  • allow_undefined_stereo (bool, default=False) – If false, raises an exception if molecule contains undefined stereochemistry.

Returns

molecule (Molecule) – The resulting molecule with position

Examples

Create a molecule from an IUPAC name

>>> molecule = Molecule.from_iupac('4-[(4-methylpiperazin-1-yl)methyl]-N-(4-methyl-3-{[4-(pyridin-3-yl)pyrimidin-2-yl]amino}phenyl)benzamide')  # noqa

Create a molecule from a common name

>>> molecule = Molecule.from_iupac('imatinib')

classmethod from_json(serialized: str)

Instantiate an object from a JSON serialized representation.

Specification: https://www.json.org/

Parameters

serialized (str) – A JSON serialized representation of the object

Returns

instance (cls) – An instantiated object

classmethod from_mapped_smiles(mapped_smiles, toolkit_registry=<ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit>, allow_undefined_stereo=False)

Create a Molecule from a mapped SMILES made with cmiles. The atoms of the molecule will be ordered according to the indexing in the mapped SMILES string.

Warning

This API is experimental and subject to change.

Parameters
  • mapped_smiles (str) – A CMILES-style mapped SMILES string with explicit hydrogens.

  • toolkit_registry (ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional) – ToolkitRegistry or ToolkitWrapper to use for SMILES-to-molecule conversion

  • allow_undefined_stereo (bool, default=False) – If False, raises an exception if the molecule contains undefined stereochemistry.

Returns

offmol (openff.toolkit.topology.molecule.Molecule) – An OpenFF molecule instance.

Raises

SmilesParsingError – If the given SMILES had no indexing picked up by the toolkits.
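
To make the ordering rule concrete, the sketch below reads the map indices out of a mapped SMILES string and lists the element symbols in map-index order. It is an illustration only: the regular expression and the helper `atom_order_from_mapped_smiles` are assumptions made here, they handle only simple bracket atoms such as `[C:1]`, and they are not how the toolkit parses SMILES.

```python
import re
from typing import List

# Matches simple bracket atoms that carry a map index, e.g. "[C:1]" or "[H:4]".
# Assumption for illustration: isotopes, charges and chirality tags are ignored.
_MAPPED_ATOM = re.compile(r"\[([A-Z][a-z]?):(\d+)\]")


def atom_order_from_mapped_smiles(mapped_smiles: str) -> List[str]:
    """Return element symbols sorted by the map index written in the string."""
    atoms = [(int(idx), symbol) for symbol, idx in _MAPPED_ATOM.findall(mapped_smiles)]
    return [symbol for _, symbol in sorted(atoms)]


# Ethanol with explicit hydrogens; the string starts with a hydrogen,
# but the map indices put the two carbons and the oxygen first.
mapped = "[H:4][C:1]([H:5])([H:6])[C:2]([H:7])([H:8])[O:3][H:9]"
order = atom_order_from_mapped_smiles(mapped)
print(order)
```

Under this reading, the atom with the lowest map index becomes the first atom of the resulting molecule, regardless of where it appears in the string.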

classmethod from_messagepack(serialized)

Instantiate an object from a MessagePack serialized representation.

Specification: https://msgpack.org/index.html

Parameters

serialized (bytes) – A MessagePack-encoded bytes serialized representation

Returns

instance (cls) – Instantiated object.

classmethod from_openeye(oemol, allow_undefined_stereo=False)

Create a Molecule from an OpenEye molecule.

Requires the OpenEye toolkit to be installed.

Parameters
  • oemol (openeye.oechem.OEMol) – An OpenEye molecule

  • allow_undefined_stereo (bool, default=False) – If False, raises an exception if oemol contains undefined stereochemistry.

Returns

molecule (openff.toolkit.topology.Molecule) – An OpenFF molecule

Examples

Create a Molecule from an OpenEye OEMol

>>> from openeye import oechem
>>> from openff.toolkit.tests.utils import get_data_file_path
>>> ifs = oechem.oemolistream(get_data_file_path('systems/monomers/ethanol.mol2'))
>>> oemols = list(ifs.GetOEGraphMols())
>>> molecule = Molecule.from_openeye(oemols[0])

classmethod from_pdb(file_path, toolkit_registry=<ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit>)

Deprecated since version 0.11.0: from_pdb is deprecated and will soon be removed. Use from_polymer_pdb() instead.

classmethod from_pdb_and_smiles(file_path, smiles, allow_undefined_stereo=False)

Create a Molecule from a pdb file and a SMILES string using RDKit.

Requires RDKit to be installed.

Warning

This API is experimental and subject to change.

The molecule is created and sanitised based on the SMILES string. A mapping is then found between this molecule and the one from the PDB, based only on atomic number and connections. Finally, the SMILES molecule is reindexed to match the PDB, the conformer is attached, and the molecule is returned.

Note that any stereochemistry in the molecule is set by the SMILES, and not the coordinates of the PDB.

Parameters
  • file_path (str) – PDB file path

  • smiles (str) – A valid SMILES string for the PDB, used for stereochemistry, formal charges, and bond order

  • allow_undefined_stereo (bool, default=False) – If false, raises an exception if SMILES contains undefined stereochemistry.

Returns

molecule (openff.toolkit.Molecule) – An OFFMol instance with ordering the same as used in the PDB file.

Raises

InvalidConformerError – If the SMILES and PDB molecules are not isomorphic.

classmethod from_pickle(serialized)

Instantiate an object from a pickle serialized representation.

Warning

This is not recommended for safe, stable storage since the pickle specification may change between Python versions.

Parameters

serialized (str) – A pickled representation of the object

Returns

instance (cls) – An instantiated object

classmethod from_polymer_pdb(file_path: ~typing.Union[str, ~typing.TextIO], toolkit_registry=<ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit>)

Loads a polymer from a PDB file.

Currently only supports proteins with canonical amino acids that are either uncapped or capped by ACE/NME groups, but may later be extended to handle other common polymers, or accept user-defined polymer templates.

Metadata such as residues, chains, and atom names are recorded in the Atom.properties attribute, which is a dictionary mapping from strings like “residue” to the appropriate value. from_polymer_pdb returns a molecule that can be iterated over with the .residues and .chains attributes, as well as the usual .atoms.

This method proceeds in the following order:

  • Loads the polymer substructure template file (distributed with the OpenFF Toolkit)

  • Loads the PDB into an OpenMM openmm.app.PDBFile object

  • Turns OpenMM topology into a temporarily invalid rdkit Molecule

  • Adds chemical information to the molecule:
    • For each substructure loaded from the substructure template file
      • Uses rdkit to find matches between the substructure and the molecule

      • For any matches, assigns the atom formal charge and bond order info from the substructure to the rdkit molecule, then marks the atoms and bonds as having been assigned so they can not be overwritten by subsequent isomorphisms

  • Take coordinates from the OpenMM Topology and add them as a conformer

  • Convert the rdkit Molecule to OpenFF

Parameters
  • file_path (str or file object) – PDB information to be passed to OpenMM PDBFile object for loading

  • toolkit_registry (ToolkitRegistry or ToolkitWrapper, default=GLOBAL_TOOLKIT_REGISTRY) – Either a ToolkitRegistry or a ToolkitWrapper

Returns

molecule (openff.toolkit.topology.Molecule)

classmethod from_qcschema(qca_record, client=None, toolkit_registry=<ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit>, allow_undefined_stereo=False)

Create a Molecule from a QCArchive molecule record or dataset entry based on attached cmiles information.

For a molecule record, a conformer will be set from its geometry.

For a dataset entry, if a corresponding client instance is provided, the starting geometry for that entry will be used as a conformer.

A QCElemental Molecule produced from Molecule.to_qcschema can be round-tripped through this method to produce a new, valid Molecule.

Parameters
  • qca_record (dict) – A QCArchive molecule record or dataset entry.

  • client (optional, default=None,) – A qcportal.FractalClient instance to use for fetching an initial geometry. Only used if qca_record is a dataset entry.

  • toolkit_registry (openff.toolkit.utils.toolkits.ToolkitRegistry or) – openff.toolkit.utils.toolkits.ToolkitWrapper, optional ToolkitRegistry or ToolkitWrapper to use for SMILES-to-molecule conversion

  • allow_undefined_stereo (bool, default=False) – If false, raises an exception if qca_record contains undefined stereochemistry.

Returns

molecule (openff.toolkit.topology.Molecule) – An OpenFF molecule instance.

Examples

Get Molecule from a QCArchive molecule record:

>>> from qcportal import FractalClient
>>> client = FractalClient()
>>> offmol = Molecule.from_qcschema(client.query_molecules(molecular_formula="C16H20N3O5")[0])

Get Molecule from a QCArchive optimization entry:

>>> from qcportal import FractalClient
>>> client = FractalClient()
>>> optds = client.get_collection("OptimizationDataset",
                                  "SMIRNOFF Coverage Set 1")
>>> offmol = Molecule.from_qcschema(optds.get_entry('coc(o)oc-0'))

Same as above, but with conformer(s) from initial molecule(s) by providing client to database:

>>> offmol = Molecule.from_qcschema(optds.get_entry('coc(o)oc-0'), client=client)

Raises
  • AttributeError – If the record dict cannot be made from qca_record, or if a client is passed and it could not retrieve the initial molecule.

  • KeyError – If the dict does not contain the canonical_isomeric_explicit_hydrogen_mapped_smiles.

  • InvalidConformerError – Silent error, if the conformer could not be attached.

classmethod from_rdkit(rdmol, allow_undefined_stereo=False, hydrogens_are_explicit=False)

Create a Molecule from an RDKit molecule.

Requires the RDKit to be installed.

Parameters
  • rdmol (rdkit.RDMol) – An RDKit molecule

  • allow_undefined_stereo (bool, default=False) – If False, raises an exception if rdmol contains undefined stereochemistry.

  • hydrogens_are_explicit (bool, default=False) – If False, RDKit will perform hydrogen addition using Chem.AddHs

Returns

molecule (openff.toolkit.topology.Molecule) – An OpenFF molecule

Examples

Create a molecule from an RDKit molecule

>>> from rdkit import Chem
>>> from openff.toolkit.tests.utils import get_data_file_path
>>> rdmol = Chem.MolFromMolFile(get_data_file_path('systems/monomers/ethanol.sdf'))
>>> molecule = Molecule.from_rdkit(rdmol)

classmethod from_smiles(smiles, hydrogens_are_explicit=False, toolkit_registry=<ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit>, allow_undefined_stereo=False)

Construct a Molecule from a SMILES representation

Parameters
  • smiles (str) – The SMILES representation of the molecule.

  • hydrogens_are_explicit (bool, default = False) – If False, the cheminformatics toolkit will perform hydrogen addition

  • toolkit_registry (ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for SMILES-to-molecule conversion

  • allow_undefined_stereo (bool, default=False) – Whether to accept SMILES with undefined stereochemistry. If False, an exception will be raised if a SMILES with undefined stereochemistry is passed into this function.

Returns

molecule (openff.toolkit.topology.Molecule)

Examples

>>> molecule = Molecule.from_smiles('Cc1ccccc1')

classmethod from_toml(serialized)

Instantiate an object from a TOML serialized representation.

Specification: https://github.com/toml-lang/toml

Parameters

serialized (str) – A TOML serialized representation of the object

Returns

instance (cls) – An instantiated object

classmethod from_topology(topology)

Return a Molecule representation of an OpenFF Topology containing a single Molecule object.

Parameters

topology (Topology) – The Topology object containing a single Molecule object. Note that OpenMM and MDTraj Topology objects are not supported.

Returns

molecule (openff.toolkit.topology.Molecule) – The Molecule object in the topology

Raises

ValueError – If the topology does not contain exactly one molecule.

Examples

Create a molecule from a Topology object that contains exactly one molecule

>>> molecule = Molecule.from_topology(topology)

classmethod from_xml(serialized)

Instantiate an object from an XML serialized representation.

Specification: https://www.w3.org/XML/

Parameters

serialized (bytes) – An XML serialized representation

Returns

instance (cls) – Instantiated object.

classmethod from_yaml(serialized)

Instantiate from a YAML serialized representation.

Specification: http://yaml.org/

Parameters

serialized (str) – A YAML serialized representation of the object

Returns

instance (cls) – Instantiated object

generate_conformers(toolkit_registry=GLOBAL_TOOLKIT_REGISTRY, n_conformers=10, rms_cutoff=None, clear_existing=True, make_carboxylic_acids_cis=True)

Generate conformers for this molecule using an underlying toolkit.

If n_conformers=0, no toolkit wrapper will be called. If n_conformers=0 and clear_existing=True, molecule.conformers will be set to None.

Parameters
  • toolkit_registry (openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for conformer generation

  • n_conformers (int, default=10) – The maximum number of conformers to produce

  • rms_cutoff (openmm.unit.Quantity-wrapped float, in units of distance, optional, default=None) – The minimum RMS value at which two conformers are considered redundant and one is deleted. Precise implementation of this cutoff may be toolkit-dependent. If None, the cutoff is set to be the default value for each ToolkitWrapper (generally 1 Angstrom).

  • clear_existing (bool, default=True) – Whether to overwrite existing conformers for the molecule

  • make_carboxylic_acids_cis (bool, default=True) – Guarantee all conformers have exclusively cis carboxylic acid groups (COOH) by rotating the proton in any trans carboxylic acids 180 degrees around the C-O bond. Works around a bug in conformer generation by the OpenEye toolkit where trans COOH is much more common than it should be.
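
The role of rms_cutoff can be pictured as a pruning step: a conformer is kept only if it is far enough, in RMS terms, from every conformer already kept. The sketch below is a deliberately simplified, pure-Python illustration of that idea. The names `rms_distance` and `prune_conformers` are hypothetical, no structural alignment is performed, and the actual behaviour is toolkit-dependent as noted above.

```python
import math
from typing import List, Sequence, Tuple

Coords = Sequence[Tuple[float, float, float]]


def rms_distance(a: Coords, b: Coords) -> float:
    """Root-mean-square distance between matching atoms, with no alignment step."""
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(a, b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(a))


def prune_conformers(conformers: List[Coords], rms_cutoff: float) -> List[Coords]:
    """Greedily keep a conformer only if it is >= rms_cutoff from all kept ones."""
    kept: List[Coords] = []
    for conformer in conformers:
        if all(rms_distance(conformer, other) >= rms_cutoff for other in kept):
            kept.append(conformer)
    return kept


# Three two-atom "conformers" (angstrom); the second nearly duplicates the first.
conformers = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 3.0, 0.0)],
]
unique = prune_conformers(conformers, rms_cutoff=1.0)
print(len(unique))
```

A larger cutoff therefore tends to return fewer, more distinct conformers, which is why the number produced can be lower than n_conformers.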

Examples

>>> molecule = Molecule.from_smiles('CCCCCC')
>>> molecule.generate_conformers()

Raises

InvalidToolkitRegistryError – If an invalid object is passed as the toolkit_registry parameter

generate_unique_atom_names()

Generate unique atom names using element name and number of times that element has occurred e.g. ‘C1x’, ‘H1x’, ‘O1x’, ‘C2x’, …

The character ‘x’ is appended to these generated names to reduce the odds that they clash with an atom name or type imported from another source.
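
The naming scheme described above can be sketched in a few lines. This is an illustration of the pattern only, operating on a plain list of element symbols; `unique_atom_names` is a hypothetical helper and not the toolkit's implementation.

```python
from collections import Counter
from typing import Iterable, List


def unique_atom_names(symbols: Iterable[str], suffix: str = "x") -> List[str]:
    """Name each atom as element symbol + running count for that element + suffix."""
    seen: Counter = Counter()
    names = []
    for symbol in symbols:
        seen[symbol] += 1  # how many atoms of this element so far
        names.append(f"{symbol}{seen[symbol]}{suffix}")
    return names


names = unique_atom_names(["C", "H", "O", "C", "H"])
print(names)
```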

get_bond_between(i, j)

Returns the bond between two atoms

Parameters
  • i (int or Atom) – Atoms or atom indices to check

  • j (int or Atom) – Atoms or atom indices to check

Returns

bond (Bond) – The bond between i and j.

property has_unique_atom_names: bool

True if the molecule has unique atom names, False otherwise.

property hierarchy_schemes: Dict[str, HierarchyScheme]

The hierarchy schemes available on the molecule.

Hierarchy schemes allow iteration over groups of atoms according to their metadata. For more information, see HierarchyScheme.

Returns

A dict of the form {str: HierarchyScheme} – The HierarchySchemes associated with the molecule.

property hill_formula: str

Get the Hill formula of the molecule

property impropers: Set[Tuple[Atom, Atom, Atom, Atom]]

Iterate over all improper torsions in the molecule.

Returns

impropers (set of tuple) – An iterator of tuples, each containing the atoms making up a possible improper torsion.

is_isomorphic_with(other, **kwargs)

Check if the molecule is isomorphic with the other molecule, which can be an openff.toolkit.topology.Molecule or nx.Graph(). Full matching is done using the options described below.

Warning

This API is experimental and subject to change.

Parameters
  • other (Molecule or nx.Graph()) – The molecule or graph to compare with this one.

  • aromatic_matching (bool, default=True, optional) – Compare the aromatic attributes of bonds and atoms.

  • formal_charge_matching (bool, default=True, optional) – Compare the formal charge attributes of the atoms.

  • bond_order_matching (bool, default=True, optional) – Compare the bond order attributes of the bonds.

  • atom_stereochemistry_matching (bool, default=True, optional) – If False, atoms’ stereochemistry is ignored for the purpose of determining equality.

  • bond_stereochemistry_matching (bool, default=True, optional) – If False, bonds’ stereochemistry is ignored for the purpose of determining equality.

  • strip_pyrimidal_n_atom_stereo (bool, default=True, optional) – If True, any stereochemistry defined around pyramidal nitrogen stereocenters will be disregarded in the isomorphism check.

  • toolkit_registry (ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=None) – ToolkitRegistry or ToolkitWrapper to use for removing stereochemistry from pyramidal nitrogens.

Returns

isomorphic (bool)

property n_angles: int

Number of angles in the molecule.

property n_atoms: int

The number of Atom objects.

property n_bonds

The number of Bond objects in the molecule.

property n_conformers: int

The number of conformers for this molecule.

property n_impropers: int

Number of possible improper torsions in the molecule.

property n_particles: int

DEPRECATED: Use Molecule.n_atoms instead.

property n_propers: int

Number of proper torsions in the molecule.

property name: str

The name (or title) of the molecule

nth_degree_neighbors(n_degrees)

Return canonicalized pairs of atoms whose shortest separation is exactly n_degrees bonds. Only pairs with increasing atom indices are returned.

Parameters

n_degrees (int) – The number of bonds separating atoms in each pair

Returns

neighbors (iterator of tuple of Atom) – Tuples (len 2) of atoms that are separated by n_degrees bonds.

Notes

The criterion used here relies on minimum distances; when there are multiple valid paths between atoms, such as atoms in rings, the shortest path is considered. For example, two atoms in “meta” positions with respect to each other in a benzene are separated by two paths, one length 2 bonds and the other length 4 bonds. This function would consider them to be 2 apart and would not include them if n_degrees=4 was passed.
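
The shortest-path rule in the note can be checked with a small breadth-first search over a bond graph. The sketch below works on plain integer indices and an adjacency dictionary rather than on Atom objects; `shortest_bond_counts` and `pairs_at_separation` are hypothetical names used for illustration, not toolkit functions.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def shortest_bond_counts(adjacency: Dict[int, Set[int]], start: int) -> Dict[int, int]:
    """Breadth-first search: number of bonds on the shortest path from `start`."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in distances:
                distances[neighbor] = distances[atom] + 1
                queue.append(neighbor)
    return distances


def pairs_at_separation(adjacency: Dict[int, Set[int]], n_degrees: int) -> List[Tuple[int, int]]:
    """Index pairs (i < j) whose shortest separation is exactly `n_degrees` bonds."""
    pairs = []
    for i in sorted(adjacency):
        distances = shortest_bond_counts(adjacency, i)
        pairs.extend((i, j) for j, d in sorted(distances.items()) if j > i and d == n_degrees)
    return pairs


# Six-membered carbon ring (benzene without hydrogens), atoms indexed 0..5.
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
meta_pairs = pairs_at_separation(ring, 2)
print(meta_pairs)
print(pairs_at_separation(ring, 4))
```

The meta pairs appear at a separation of 2, and nothing is returned at a separation of 4, because the longer way around the ring is never the shortest path.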

ordered_connection_table_hash()

Compute an ordered hash of the atoms and bonds in the molecule

property partial_charges

Returns the partial charges (if present) on the molecule.

Returns

partial_charges (an openmm.unit.Quantity-wrapped numpy array [1 x n_atoms] or None) – The partial charges on the molecule’s atoms. Returns None if no charges have been specified.

particle(index: int) → Atom

DEPRECATED: Use Molecule.atom instead.

particle_index(particle: Atom) → int

DEPRECATED: Use Molecule.atom_index instead.

property particles: List[Atom]

DEPRECATED: Use Molecule.atoms instead.

property propers: Set[Tuple[Atom, Atom, Atom, Atom]]

Iterate over all proper torsions in the molecule

property properties: Dict[str, Any]

The properties dictionary of the molecule

remap(mapping_dict, current_to_new=True)

Remap all of the indexes in the molecule to match the given mapping dict

Warning

This API is experimental and subject to change.

Parameters
  • mapping_dict (dict) – A dictionary of the mapping between indices. Indices should start from 0.

  • current_to_new (bool, default=True) – If this is True, then mapping_dict is of the form {current_index: new_index}; otherwise, it is of the form {new_index: current_index}

Returns

new_molecule (openff.toolkit.topology.molecule.Molecule) – An openff.toolkit.Molecule instance with all attributes transferred, in the order given by mapping_dict.
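
The two conventions selected by current_to_new describe the same reordering from opposite directions. The sketch below applies both to a plain list of labels so the effect is easy to see; `remap_labels` is a hypothetical stand-in written for this illustration and does not touch bonds, conformers, or other molecule attributes.

```python
from typing import Dict, List


def remap_labels(labels: List[str], mapping_dict: Dict[int, int], current_to_new: bool = True) -> List[str]:
    """Reorder `labels` following either mapping convention."""
    if not current_to_new:
        # Invert {new_index: current_index} into {current_index: new_index}.
        mapping_dict = {current: new for new, current in mapping_dict.items()}
    reordered = [""] * len(labels)
    for current, new in mapping_dict.items():
        reordered[new] = labels[current]
    return reordered


atoms = ["C", "O", "H"]
# Both calls move atom 0 to position 2, atom 1 to position 0, atom 2 to position 1.
forward = remap_labels(atoms, {0: 2, 1: 0, 2: 1}, current_to_new=True)
backward = remap_labels(atoms, {2: 0, 0: 1, 1: 2}, current_to_new=False)
print(forward, backward)
```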

property smirnoff_impropers: Set[Tuple[Atom, Atom, Atom, Atom]]

Iterate over all impropers with trivalent centers, reporting the central atom second.

The central atom is reported second in each torsion. This method reports an improper for each trivalent atom in the molecule, whether or not any given force field would assign it improper torsion parameters.

Also note that this will return 6 possible atom orderings around each improper center. In current SMIRNOFF parameterization, three of these six orderings will be used for the actual assignment of the improper term and measurement of the angles. These three orderings capture the three unique angles that could be calculated around the improper center, therefore the sum of these three terms will always return a consistent energy.

The exact three orderings that will be applied during parameterization can not be determined in this method, since it requires sorting the atom indices, and those indices may change when this molecule is added to a Topology.

For more details on the use of three-fold (‘trefoil’) impropers, see https://openforcefield.github.io/standards/standards/smirnoff/#impropertorsions

Returns

impropers (set of tuple) – An iterator of tuples, each containing the indices of atoms making up a possible improper torsion. The central atom is listed second in each tuple.
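
The six orderings mentioned above are simply the permutations of the three neighbours, with the central atom held in the second position. A minimal sketch on integer indices follows; `improper_orderings` is a hypothetical helper for illustration and says nothing about which three orderings a force field ends up using.

```python
from itertools import permutations
from typing import List, Tuple


def improper_orderings(center: int, neighbors: Tuple[int, int, int]) -> List[Tuple[int, int, int, int]]:
    """All orderings of three neighbours around a trivalent center, center second."""
    return [(a, center, b, c) for a, b, c in permutations(neighbors)]


# Hypothetical trivalent atom 1 bonded to atoms 0, 2 and 3.
orderings = improper_orderings(1, (0, 2, 3))
for ordering in orderings:
    print(ordering)
```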

strip_atom_stereochemistry(smarts, toolkit_registry=GLOBAL_TOOLKIT_REGISTRY)

Delete stereochemistry information for certain atoms, if it is present. This method can be used to “normalize” molecules imported from different cheminformatics toolkits, which differ in which atom centers are considered stereogenic.

Parameters
  • smarts (str or ChemicalEnvironment) – Tagged SMARTS with a single atom with index 1. Any matches for this atom will have any assigned stereochemistry information removed.

  • toolkit_registry (ToolkitRegistry or ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for I/O operations

to_bson()

Return a BSON serialized representation.

Specification: http://bsonspec.org/

Returns

serialized (bytes) – A BSON serialized representation of the object

to_dict()

Return a dictionary representation of the molecule.

Returns

molecule_dict (OrderedDict) – A dictionary representation of the molecule.

to_file(file_path, file_format, toolkit_registry=GLOBAL_TOOLKIT_REGISTRY)

Write the current molecule to a file or file-like object

Parameters
  • file_path (str or file-like object) – A file-like object or the path to the file to be written.

  • file_format (str) – Format specifier, one of [‘MOL2’, ‘MOL2H’, ‘SDF’, ‘PDB’, ‘SMI’, ‘CAN’, ‘TDT’]. Note that not all toolkits support all formats.

  • toolkit_registry (ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for file writing. If a Toolkit is passed, only the highest-precedence toolkit is used

Raises

ValueError – If the requested file_format is not supported by one of the installed cheminformatics toolkits

Examples

>>> molecule = Molecule.from_iupac('imatinib')
>>> molecule.to_file('imatinib.mol2', file_format='mol2')
>>> molecule.to_file('imatinib.sdf', file_format='sdf')
>>> molecule.to_file('imatinib.pdb', file_format='pdb')

to_hill_formula() → str

Generate the Hill formula of this molecule.

Returns

formula (the Hill formula of the molecule)

Raises

NotImplementedError – If the molecule is not of one of the specified types.
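
For readers unfamiliar with the Hill system, the ordering rule is: carbon first, hydrogen second, then all other elements alphabetically; when no carbon is present, every element including hydrogen is listed alphabetically. The sketch below applies that rule to a list of element symbols. `hill_formula` is a hypothetical helper for illustration, and omitting a count of 1 is the usual convention rather than something stated in this reference.

```python
from collections import Counter
from typing import Iterable


def hill_formula(symbols: Iterable[str]) -> str:
    """Hill system: C first, then H, then the other elements alphabetically.

    Without carbon, every element (including H) is listed alphabetically.
    """
    counts = Counter(symbols)
    if "C" in counts:
        order = ["C"] + (["H"] if "H" in counts else [])
        order += sorted(el for el in counts if el not in ("C", "H"))
    else:
        order = sorted(counts)
    # A count of 1 is conventionally left implicit.
    return "".join(f"{el}{counts[el] if counts[el] > 1 else ''}" for el in order)


ethanol = hill_formula(["C", "C", "O", "H", "H", "H", "H", "H", "H"])
print(ethanol)
```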

to_inchi(fixed_hydrogens=False, toolkit_registry=GLOBAL_TOOLKIT_REGISTRY)

Create an InChI string for the molecule using the requested toolkit backend. InChI is a standardised representation that does not capture tautomers unless specified using the fixed hydrogen layer.

For information on InChI, see https://iupac.org/who-we-are/divisions/division-details/inchi/

Parameters
  • fixed_hydrogens (bool, default=False) – Whether to add a fixed hydrogen layer to the InChI. If True, this produces a non-standard, tautomer-specific InChI string of the molecule.

  • toolkit_registry (openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for molecule-to-InChI conversion

Returns

inchi (str) – The InChI string of the molecule.

Raises

InvalidToolkitRegistryError – If an invalid object is passed as the toolkit_registry parameter

to_inchikey(fixed_hydrogens=False, toolkit_registry=GLOBAL_TOOLKIT_REGISTRY)

Create an InChIKey for the molecule using the requested toolkit backend. InChIKey is a standardised representation that does not capture tautomers unless specified using the fixed hydrogen layer.

For information on InChI, see https://iupac.org/who-we-are/divisions/division-details/inchi/

Parameters
  • fixed_hydrogens (bool, default=False) – Whether to add a fixed hydrogen layer to the InChI. If True, this produces a non-standard, tautomer-specific InChI string of the molecule.

  • toolkit_registry (openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for molecule-to-InChIKey conversion

Returns

inchi_key (str) – The InChIKey representation of the molecule.

Raises

InvalidToolkitRegistryError – If an invalid object is passed as the toolkit_registry parameter

to_iupac(toolkit_registry=GLOBAL_TOOLKIT_REGISTRY)

Generate IUPAC name from Molecule

Returns

iupac_name (str) – IUPAC name of the molecule

Note

This method requires the OpenEye toolkit to be installed.

Examples

>>> from openff.toolkit.utils import get_data_file_path
>>> sdf_filepath = get_data_file_path('molecules/ethanol.sdf')
>>> molecule = Molecule(sdf_filepath)
>>> iupac_name = molecule.to_iupac()

to_json(indent=None) → str

Return a JSON serialized representation.

Specification: https://www.json.org/

Parameters

indent (int, optional, default=None) – If not None, will pretty-print with specified number of spaces for indentation

Returns

serialized (str) – A JSON serialized representation of the object

to_messagepack()

Return a MessagePack representation.

Specification: https://msgpack.org/index.html

Returns

serialized (bytes) – A MessagePack-encoded bytes serialized representation of the object

to_networkx()

Generate a NetworkX undirected graph from the molecule.

Nodes are Atoms labeled with atom indices and atomic elements (via the element node attribute). Edges denote chemical bonds between Atoms.

Returns

graph (networkx.Graph) – The resulting graph, with nodes (atoms) labeled with atom indices, elements, stereochemistry and aromaticity flags and bonds with two atom indices, bond order, stereochemistry, and aromaticity flags

Examples

Retrieve the bond graph for imatinib (OpenEye toolkit required)

>>> molecule = Molecule.from_iupac('imatinib')
>>> nxgraph = molecule.to_networkx()

to_openeye(toolkit_registry=GLOBAL_TOOLKIT_REGISTRY, aromaticity_model=DEFAULT_AROMATICITY_MODEL)

Create an OpenEye molecule

Requires the OpenEye toolkit to be installed.

Parameters

aromaticity_model (str, optional, default=DEFAULT_AROMATICITY_MODEL) – The aromaticity model to use

Returns

oemol (openeye.oechem.OEMol) – An OpenEye molecule

Examples

Create an OpenEye molecule from a Molecule

>>> molecule = Molecule.from_smiles('CC')
>>> oemol = molecule.to_openeye()

to_pickle()

Return a pickle serialized representation.

Warning

This is not recommended for safe, stable storage since the pickle specification may change between Python versions.

Returns

serialized (str) – A pickled representation of the object

to_qcschema(multiplicity=1, conformer=0, extras=None)

Create a QCElemental Molecule.

Warning

This API is experimental and subject to change.

Parameters
  • multiplicity (int, default=1) – The multiplicity of the molecule; sets the molecular_multiplicity field for the QCElemental Molecule.

  • conformer (int, default=0) – The index of the conformer to use for the QCElemental Molecule geometry.

  • extras (dict, default=None) – A dictionary that should be included in the extras field on the QCElemental Molecule. This can be used to include extra information, such as a SMILES representation.

Returns

qcelemental.models.Molecule – A validated QCElemental Molecule.

Examples

Create a QCElemental Molecule:

>>> import qcelemental as qcel
>>> mol = Molecule.from_smiles('CC')
>>> mol.generate_conformers(n_conformers=1)
>>> qcemol = mol.to_qcschema()

Raises
  • MissingOptionalDependencyError – If qcelemental is not installed, the qcschema can not be validated.

  • InvalidConformerError – No conformer found at the given index.

to_rdkit(aromaticity_model=DEFAULT_AROMATICITY_MODEL, toolkit_registry=GLOBAL_TOOLKIT_REGISTRY)

Create an RDKit molecule

Requires the RDKit to be installed.

Parameters

aromaticity_model (str, optional, default=DEFAULT_AROMATICITY_MODEL) – The aromaticity model to use

Returns

rdmol (rdkit.RDMol) – An RDKit molecule

Examples

Convert a molecule to RDKit

>>> from openff.toolkit.utils import get_data_file_path
>>> sdf_filepath = get_data_file_path('molecules/ethanol.sdf')
>>> molecule = Molecule(sdf_filepath)
>>> rdmol = molecule.to_rdkit()

to_smiles(isomeric=True, explicit_hydrogens=True, mapped=False, toolkit_registry=GLOBAL_TOOLKIT_REGISTRY)

Return a canonical isomeric SMILES representation of the current molecule. A partially mapped smiles can also be generated for atoms of interest by supplying an atom_map to the properties dictionary.

Note

RDKit and OpenEye versions will not necessarily return the same representation.

Parameters
  • isomeric (bool, optional, default=True) – Return an isomeric SMILES.

  • explicit_hydrogens (bool, optional, default=True) – Return a SMILES string that contains all hydrogens explicitly.

  • mapped (bool, optional, default=False) – Return an explicit-hydrogen mapped SMILES. The atoms to be mapped can be controlled by supplying an atom map in the properties dictionary. If no mapping is supplied, all atoms are mapped in order; otherwise, supply an atom map dictionary from the current atom index to the map id, with no duplicates. The map ids (values) should start from 0 or 1.

  • toolkit_registry (openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY) – ToolkitRegistry or ToolkitWrapper to use for SMILES conversion.
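The rules for the atom map described under mapped can be illustrated with a small, toolkit-independent sketch. The helper resolve_atom_map below is hypothetical and is not part of the OpenFF Toolkit; it assumes that "mapped in order" means map id = atom index + 1, and it only mirrors the stated constraints (no duplicate map ids, ids starting from 0 or 1).

```python
def resolve_atom_map(n_atoms, atom_map=None):
    """Return an {atom_index: map_id} dictionary following the documented rules.

    Hypothetical helper for illustration only; not toolkit code.
    """
    if atom_map is None:
        # Assumption: with no map supplied, every atom is mapped in order,
        # using 1-based map ids.
        return {index: index + 1 for index in range(n_atoms)}

    map_ids = list(atom_map.values())
    # Each map id may be used only once.
    if len(set(map_ids)) != len(map_ids):
        raise ValueError("atom map contains duplicate map ids")
    # Keys must refer to atoms that exist in the molecule.
    if any(index < 0 or index >= n_atoms for index in atom_map):
        raise ValueError("atom map refers to an atom index out of range")
    # The smallest map id should be 0 or 1.
    if map_ids and min(map_ids) not in (0, 1):
        raise ValueError("map ids should start from 0 or 1")
    return dict(atom_map)


# Full mapping of a three-atom molecule, then a partial mapping of two atoms.
full_map = resolve_atom_map(3)
partial_map = resolve_atom_map(3, {0: 1, 2: 2})
```

With the toolkit itself, a dictionary of this shape would be stored under the atom_map key of the molecule's properties before calling to_smiles(mapped=True).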

Returns

smiles (str) – Canonical isomeric explicit-hydrogen SMILES

Examples

>>> from openff.toolkit.utils import get_data_file_path
>>> sdf_filepath = get_data_file_path('molecules/ethanol.sdf')
>>> molecule = Molecule(sdf_filepath)
>>> smiles = molecule.to_smiles()

to_toml()

Return a TOML serialized representation.

Specification: https://github.com/toml-lang/toml

Returns

serialized (str) – A TOML serialized representation of the object

to_topology()

Return an OpenFF Topology representation containing one copy of this molecule

Returns

topology (openff.toolkit.topology.Topology) – A Topology representation of this molecule

Examples

>>> molecule = Molecule.from_iupac('imatinib')
>>> topology = molecule.to_topology()

to_xml(indent=2)

Return an XML representation.

Specification: https://www.w3.org/XML/

Parameters

indent (int, optional, default=2) – If not None, will pretty-print with specified number of spaces for indentation

Returns

serialized (str) – An XML serialized representation of the object

to_yaml()

Return a YAML serialized representation.

Specification: http://yaml.org/

Returns

serialized (str) – A YAML serialized representation of the object

property torsions: Set[Tuple[Atom, Atom, Atom, Atom]]

Get an iterator over all i-j-k-l torsions. Note that i-j-k-i torsions (cycles) are excluded.

Returns

torsions (iterable of 4-Atom tuples)
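As an illustration of what "all i-j-k-l torsions, excluding i-j-k-i cycles" means, here is a minimal sketch that works on plain atom indices and a bond list rather than on Atom objects. The function enumerate_torsions is a hypothetical name, and the choice to store each torsion with the smaller terminal index first (so that i-j-k-l and l-k-j-i count once) is an assumption of this sketch, not a statement about the toolkit's internals.

```python
from collections import defaultdict


def enumerate_torsions(bonds):
    """Return the set of unique (i, j, k, l) torsions for a list of bonds.

    Illustrative only: atoms are integer indices, bonds are index pairs.
    """
    # Build an adjacency map from the bond list.
    neighbors = defaultdict(set)
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)

    torsions = set()
    # Treat every bond as the central j-k bond of candidate torsions.
    for j, k in bonds:
        for i in neighbors[j] - {k}:
            for l in neighbors[k] - {j}:
                if i == l:
                    # i-j-k-i is a three-membered cycle, not a torsion.
                    continue
                # Store one orientation only, so reversed duplicates collapse.
                torsion = (i, j, k, l) if i < l else (l, k, j, i)
                torsions.add(torsion)
    return torsions


# A four-atom chain has one torsion; a three-membered ring has none.
chain_torsions = enumerate_torsions([(0, 1), (1, 2), (2, 3)])
ring_torsions = enumerate_torsions([(0, 1), (1, 2), (2, 0)])
```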

property total_charge

Return the total charge on the molecule

update_hierarchy_schemes(iter_names=None)

Infer a hierarchy from atom metadata according to the existing hierarchy schemes.

Hierarchy schemes allow iteration over groups of atoms according to their metadata. For more information, see HierarchyScheme.

Parameters

iter_names (Iterable of str, Optional) – Only perceive hierarchy for HierarchySchemes that expose these iterator names. If not provided, all known hierarchies will be perceived, overwriting previous results if applicable.
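The idea of perceiving hierarchies from atom metadata, optionally restricted by iter_names, can be sketched without the toolkit. Everything below is an assumption made for illustration: atoms are represented as plain metadata dictionaries, each scheme is reduced to a tuple of metadata keys, and the function names are hypothetical. See HierarchyScheme for the actual behavior.

```python
def group_by_metadata(atom_metadata, uniqueness_criteria):
    """Group atom indices by the metadata values named in uniqueness_criteria."""
    groups = {}
    for index, metadata in enumerate(atom_metadata):
        # Atoms that share the same values for every criterion fall into one group.
        key = tuple(metadata.get(field) for field in uniqueness_criteria)
        groups.setdefault(key, []).append(index)
    return groups


def perceive_hierarchies(atom_metadata, schemes, iter_names=None):
    """Perceive groups for all schemes, or only for those listed in iter_names."""
    if iter_names is None:
        selected = schemes
    else:
        selected = {name: schemes[name] for name in iter_names}
    return {
        name: group_by_metadata(atom_metadata, criteria)
        for name, criteria in selected.items()
    }


# Three atoms in chain A: two in residue 1 and one in residue 2.
atoms = [
    {"chain_id": "A", "residue_number": 1},
    {"chain_id": "A", "residue_number": 1},
    {"chain_id": "A", "residue_number": 2},
]
# Hypothetical schemes keyed by iterator name.
schemes = {
    "residues": ("chain_id", "residue_number"),
    "chains": ("chain_id",),
}

only_residues = perceive_hierarchies(atoms, schemes, iter_names=["residues"])
everything = perceive_hierarchies(atoms, schemes)
```

Passing iter_names limits the work to the named schemes, while omitting it recomputes every scheme, which matches the overwrite behavior described above.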