openff.toolkit.topology.Molecule

class openff.toolkit.topology.Molecule(*args, **kwargs)[source]

Mutable chemical representation of a molecule, such as a small molecule or biopolymer.

Examples

Create a molecule from an sdf file

>>> from openff.toolkit.utils import get_data_file_path
>>> sdf_filepath = get_data_file_path('molecules/ethanol.sdf')
>>> molecule = Molecule(sdf_filepath)

Convert to OpenEye OEMol object

>>> oemol = molecule.to_openeye()

Create a molecule from an OpenEye molecule

>>> molecule = Molecule.from_openeye(oemol)

Convert to RDKit Mol object

>>> rdmol = molecule.to_rdkit()

Create a molecule from an RDKit molecule

>>> molecule = Molecule.from_rdkit(rdmol)

Create a molecule from IUPAC name (requires the OpenEye toolkit)

>>> molecule = Molecule.from_iupac('imatinib')

Create a molecule from SMILES

>>> molecule = Molecule.from_smiles('Cc1ccccc1')

Warning

This API is experimental and subject to change.

__init__(*args, **kwargs)[source]

Create a new Molecule object

Parameters
otheroptional, default=None

If specified, attempt to construct a copy of the Molecule from the specified object. This can be any one of the following:

  • a Molecule object

  • a file that can be used to construct a Molecule object

  • an openeye.oechem.OEMol

  • an rdkit.Chem.rdchem.Mol

  • a serialized Molecule object

Examples

Create an empty molecule:

>>> empty_molecule = Molecule()

Create a molecule from a file that can be used to construct a molecule, using either a filename or file-like object:

>>> from openff.toolkit.utils import get_data_file_path
>>> sdf_filepath = get_data_file_path('molecules/ethanol.sdf')
>>> molecule = Molecule(sdf_filepath)
>>> molecule = Molecule(open(sdf_filepath, 'r'), file_format='sdf')
>>> import gzip
>>> mol2_gz_filepath = get_data_file_path('molecules/toluene.mol2.gz')
>>> molecule = Molecule(gzip.GzipFile(mol2_gz_filepath, 'r'), file_format='mol2')

Create a molecule from another molecule:

>>> molecule_copy = Molecule(molecule)

Convert to OpenEye OEMol object

>>> oemol = molecule.to_openeye()

Create a molecule from an OpenEye molecule:

>>> molecule = Molecule(oemol)

Convert to RDKit Mol object

>>> rdmol = molecule.to_rdkit()

Create a molecule from an RDKit molecule:

>>> molecule = Molecule(rdmol)

Create a molecule from a serialized molecule object:

>>> serialized_molecule = molecule.__getstate__()
>>> molecule_copy = Molecule(serialized_molecule)

Methods

__init__(*args, **kwargs)

Create a new Molecule object

add_atom(atomic_number, formal_charge, ...)

Add an atom

add_bond(atom1, atom2, bond_order, is_aromatic)

Add a bond between two specified atom indices

add_bond_charge_virtual_site(atoms, ...)

Add a virtual site representing the charge on a bond.

add_conformer(coordinates)

Add a conformation of the molecule

add_divalent_lone_pair_virtual_site(atoms, ...)

Create a divalent lone pair-type virtual site, in which the location of the charge is specified by the position of three atoms.

add_monovalent_lone_pair_virtual_site(atoms, ...)

Create a bond charge-type virtual site, in which the location of the charge is specified by the position of three atoms.

add_trivalent_lone_pair_virtual_site(atoms, ...)

Create a trivalent lone pair-type virtual site, in which the location of the charge is specified by the position of four atoms.

apply_elf_conformer_selection([percentage, ...])

Applies the ELF method to select a set of diverse conformers which have minimal electrostatically strongly interacting functional groups from a molecules conformers.

are_isomorphic(mol1, mol2[, ...])

Determines whether the two molecules are isomorphic by comparing their graph representations and the chosen node/edge attributes.

assign_fractional_bond_orders([...])

Update and store list of bond orders this molecule.

assign_partial_charges(partial_charge_method)

Calculate partial atomic charges for this molecule using an underlying toolkit, and assign the new values to the partial_charges attribute.

canonical_order_atoms([toolkit_registry])

Canonical order the atoms in a copy of the molecule using a toolkit, returns a new copy.

chemical_environment_matches(query[, ...])

Retrieve all matches for a given chemical environment query.

compute_partial_charges_am1bcc([...])

Calculate partial atomic charges for this molecule using AM1-BCC run by an underlying toolkit and assign them to this molecule's partial_charges attribute.

compute_virtual_site_positions_from_atom_positions(...)

Compute the positions of the virtual sites in this molecule given a set of external coordinates.

compute_virtual_site_positions_from_conformer(...)

Compute the position of all virtual sites given an existing conformer specified by its index.

enumerate_protomers([max_states])

Enumerate the formal charges of a molecule to generate different protomoers.

enumerate_stereoisomers([undefined_only, ...])

Enumerate the stereocenters and bonds of the current molecule.

enumerate_tautomers([max_states, ...])

Enumerate the possible tautomers of the current molecule

find_rotatable_bonds([...])

Find all bonds classed as rotatable ignoring any matched to the ignore_functional_groups list.

from_bson(serialized)

Instantiate an object from a BSON serialized representation.

from_dict(molecule_dict)

Create a new Molecule from a dictionary representation

from_file(file_path[, file_format, ...])

Create one or more molecules from a file

from_inchi(inchi[, allow_undefined_stereo, ...])

Construct a Molecule from a InChI representation

from_iupac(iupac_name[, toolkit_registry, ...])

Generate a molecule from IUPAC or common name

from_json(serialized)

Instantiate an object from a JSON serialized representation.

from_mapped_smiles(mapped_smiles[, ...])

Create an Molecule from a mapped SMILES made with cmiles.

from_messagepack(serialized)

Instantiate an object from a MessagePack serialized representation.

from_openeye(oemol[, allow_undefined_stereo])

Create a Molecule from an OpenEye molecule.

from_pdb_and_smiles(file_path, smiles[, ...])

Create a Molecule from a pdb file and a SMILES string using RDKit.

from_pickle(serialized)

Instantiate an object from a pickle serialized representation.

from_qcschema(qca_record[, client, ...])

Create a Molecule from a QCArchive molecule record or dataset entry based on attached cmiles information.

from_rdkit(rdmol[, allow_undefined_stereo, ...])

Create a Molecule from an RDKit molecule.

from_smiles(smiles[, ...])

Construct a Molecule from a SMILES representation

from_toml(serialized)

Instantiate an object from a TOML serialized representation.

from_topology(topology)

Return a Molecule representation of an OpenFF Topology containing a single Molecule object.

from_xml(serialized)

Instantiate an object from an XML serialized representation.

from_yaml(serialized)

Instantiate from a YAML serialized representation.

generate_conformers([toolkit_registry, ...])

Generate conformers for this molecule using an underlying toolkit.

generate_unique_atom_names()

Generate unique atom names using element name and number of times that element has occurred e.g.

get_bond_between(i, j)

Returns the bond between two atoms

is_isomorphic_with(other, **kwargs)

Check if the molecule is isomorphic with the other molecule which can be an openff.toolkit.topology.Molecule, or TopologyMolecule or nx.Graph().

nth_degree_neighbors(n_degrees)

Return canonicalized pairs of atoms whose shortest separation is exactly n bonds.

remap(mapping_dict[, current_to_new])

Remap all of the indexes in the molecule to match the given mapping dict

strip_atom_stereochemistry(smarts[, ...])

Delete stereochemistry information for certain atoms, if it is present.

to_bson()

Return a BSON serialized representation.

to_dict()

Return a dictionary representation of the molecule.

to_file(file_path, file_format[, ...])

Write the current molecule to a file or file-like object

to_hill_formula(molecule)

Generate the Hill formula from either a FrozenMolecule, TopologyMolecule or nx.Graph() of the molecule

to_inchi([fixed_hydrogens, toolkit_registry])

Create an InChI string for the molecule using the requested toolkit backend.

to_inchikey([fixed_hydrogens, toolkit_registry])

Create an InChIKey for the molecule using the requested toolkit backend.

to_iupac([toolkit_registry])

Generate IUPAC name from Molecule

to_json([indent])

Return a JSON serialized representation.

to_messagepack()

Return a MessagePack representation.

to_networkx()

Generate a NetworkX undirected graph from the Molecule.

to_openeye([aromaticity_model])

Create an OpenEye molecule

to_pickle()

Return a pickle serialized representation.

to_qcschema([multiplicity, conformer, extras])

Create a QCElemental Molecule.

to_rdkit([aromaticity_model])

Create an RDKit molecule

to_smiles([isomeric, explicit_hydrogens, ...])

Return a canonical isomeric SMILES representation of the current molecule.

to_toml()

Return a TOML serialized representation.

to_topology()

Return an OpenFF Topology representation containing one copy of this molecule

to_xml([indent])

Return an XML representation.

to_yaml()

Return a YAML serialized representation.

visualize([backend, width, height, ...])

Render a visualization of the molecule in Jupyter

Attributes

amber_impropers

Iterate over improper torsions in the molecule, but only those with trivalent centers, reporting the central atom first in each improper.

angles

Get an iterator over all i-j-k angles.

atoms

Iterate over all Atom objects.

bonds

Iterate over all Bond objects.

conformers

Returns the list of conformers for this molecule.

has_unique_atom_names

True if the molecule has unique atom names, False otherwise.

hill_formula

Get the Hill formula of the molecule

impropers

Iterate over all improper torsions in the molecule.

n_angles

int: number of angles in the Molecule.

n_atoms

The number of Atom objects.

n_bonds

The number of Bond objects.

n_conformers

Returns the number of conformers for this molecule.

n_impropers

int: number of possible improper torsions in the Molecule.

n_particles

The number of Particle objects, which corresponds to how many positions must be used.

n_propers

int: number of proper torsions in the Molecule.

n_rings

Return the number of rings found in the Molecule

n_virtual_particles

The number of VirtualParticle objects.

n_virtual_sites

The number of VirtualSite objects.

name

The name (or title) of the molecule

partial_charges

Returns the partial charges (if present) on the molecule.

particles

Iterate over all Particle objects.

propers

Iterate over all proper torsions in the molecule

properties

The properties dictionary of the molecule

rings

Return the number of rings in this molecule.

smirnoff_impropers

Iterate over improper torsions in the molecule, but only those with trivalent centers, reporting the central atom second in each improper.

torsions

Get an iterator over all i-j-k-l torsions.

total_charge

Return the total charge on the molecule

virtual_sites

Iterate over all VirtualSite objects.

property amber_impropers

Iterate over improper torsions in the molecule, but only those with trivalent centers, reporting the central atom first in each improper.

Note that it’s possible that a trivalent center will not have an improper assigned. This will depend on the force field that is used.

Also note that this will return 6 possible atom orderings around each improper center. In current AMBER parameterization, one of these six orderings will be used for the actual assignment of the improper term and measurement of the angle. This method does not encode the logic to determine which of the six orderings AMBER would use.

Returns
impropersset of tuple

An iterator of tuples, each containing the indices of atoms making up a possible improper torsion. The central atom is listed first in each tuple.

property angles

Get an iterator over all i-j-k angles.

apply_elf_conformer_selection(percentage: float = 2.0, limit: int = 10, toolkit_registry: Optional[Union[openff.toolkit.utils.toolkit_registry.ToolkitRegistry, openff.toolkit.utils.base_wrapper.ToolkitWrapper]] = ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit, **kwargs)

Applies the ELF method to select a set of diverse conformers which have minimal electrostatically strongly interacting functional groups from a molecules conformers.

Parameters
toolkit_registry

The underlying toolkit to use to select the ELF conformers.

percentage

The percentage of conformers with the lowest electrostatic interaction energies to greedily select from.

limit

The maximum number of conformers to select.

See also

OpenEyeToolkitWrapper.apply_elf_conformer_selection
RDKitToolkitWrapper.apply_elf_conformer_selection

Notes

  • The input molecule should have a large set of conformers already generated to select the ELF conformers from.

  • The selected conformers will be retained in the conformers list while unselected conformers will be discarded.

static are_isomorphic(mol1, mol2, return_atom_map=False, aromatic_matching=True, formal_charge_matching=True, bond_order_matching=True, atom_stereochemistry_matching=True, bond_stereochemistry_matching=True, strip_pyrimidal_n_atom_stereo=True, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit)

Determines whether the two molecules are isomorphic by comparing their graph representations and the chosen node/edge attributes. Minimally connections and atomic_number are checked.

If nx.Graphs() are given they must at least have atomic_number attributes on nodes. other optional attributes for nodes are: is_aromatic, formal_charge and stereochemistry. optional attributes for edges are: is_aromatic, bond_order and stereochemistry.

Warning

This API is experimental and subject to change.

Parameters
mol1an openff.toolkit.topology.molecule.FrozenMolecule or TopologyMolecule or nx.Graph()
mol2an openff.toolkit.topology.molecule.FrozenMolecule or TopologyMolecule or nx.Graph()

The molecule to test for isomorphism.

return_atom_map: bool, default=False, optional

will return an optional dict containing the atomic mapping.

aromatic_matching: bool, default=True, optional

compare the aromatic attributes of bonds and atoms.

formal_charge_matching: bool, default=True, optional

compare the formal charges attributes of the atoms.

bond_order_matching: bool, deafult=True, optional

compare the bond order on attributes of the bonds.

atom_stereochemistry_matchingbool, default=True, optional

If False, atoms’ stereochemistry is ignored for the purpose of determining equality.

bond_stereochemistry_matchingbool, default=True, optional

If False, bonds’ stereochemistry is ignored for the purpose of determining equality.

strip_pyrimidal_n_atom_stereo: bool, default=True, optional

If True, any stereochemistry defined around pyrimidal nitrogen stereocenters will be disregarded in the isomorphism check.

toolkit_registryopenff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for removing stereochemistry from pyrimidal nitrogens.

Returns
molecules_are_isomorphicbool
atom_mapdefault=None, Optional,

[Dict[int,int]] ordered by mol1 indexing {mol1_index: mol2_index} If molecules are not isomorphic given input arguments, will return None instead of dict.

assign_fractional_bond_orders(bond_order_model=None, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit, use_conformers=None)

Update and store list of bond orders this molecule. Bond orders are stored on each bond, in the bond.fractional_bond_order attribute.

Warning

This API is experimental and subject to change.

Parameters
toolkit_registryopenff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for SMILES-to-molecule conversion

bond_order_modelstring, optional. Default=None

The bond order model to use for fractional bond order calculation. If None, “am1-wiberg” will be used.

use_conformersiterable of openmm.unit.Quantity(np.array) with shape (n_atoms, 3) and dimension of distance, optional, default=None

The conformers to use for fractional bond order calculation. If None, an appropriate number of conformers will be generated by an available ToolkitWrapper.

Raises
InvalidToolkitRegistryError

If an invalid object is passed as the toolkit_registry parameter

Examples

>>> molecule = Molecule.from_smiles('CCCCCC')
>>> molecule.assign_fractional_bond_orders()
assign_partial_charges(partial_charge_method, strict_n_conformers=False, use_conformers=None, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit, normalize_partial_charges=True)

Calculate partial atomic charges for this molecule using an underlying toolkit, and assign the new values to the partial_charges attribute.

Parameters
partial_charge_methodstring

The partial charge calculation method to use for partial charge calculation.

strict_n_conformersbool, default=False

Whether to raise an exception if an invalid number of conformers is provided for the given charge method. If this is False and an invalid number of conformers is found, a warning will be raised.

use_conformersiterable of openmm.unit.Quantity-wrapped numpy arrays, each with shape (n_atoms, 3) and dimension of distance. Optional, default=None

Coordinates to use for partial charge calculation. If None, an appropriate number of conformers will be generated.

toolkit_registryopenff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for the calculation.

normalize_partial_chargesbool, default=True

Whether to offset partial charges so that they sum to the total formal charge of the molecule. This is used to prevent accumulation of rounding errors when the partial charge assignment method returns values at limited precision.

Raises
InvalidToolkitRegistryError

If an invalid object is passed as the toolkit_registry parameter

Examples

>>> molecule = Molecule.from_smiles('CCCCCC')
>>> molecule.assign_partial_charges('am1-mulliken')
property atoms

Iterate over all Atom objects.

property bonds

Iterate over all Bond objects.

canonical_order_atoms(toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit)

Canonical order the atoms in a copy of the molecule using a toolkit, returns a new copy.

Warning

This API is experimental and subject to change.

Parameters
toolkit_registryopenff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional

ToolkitRegistry or ToolkitWrapper to use for SMILES-to-molecule conversion

Returns
moleculeopenff.toolkit.topology.Molecule

An new OpenFF style molecule with atoms in the canonical order.

chemical_environment_matches(query, unique=False, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit)

Retrieve all matches for a given chemical environment query.

Parameters
querystr or ChemicalEnvironment

SMARTS string (with one or more tagged atoms) or ChemicalEnvironment query. Query will internally be resolved to SMIRKS using query.asSMIRKS() if it has an .asSMIRKS method.

toolkit_registryopenff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY

ToolkitRegistry or ToolkitWrapper to use for chemical environment matches

Returns
matcheslist of atom index tuples

A list of tuples, containing the indices of the matching atoms.

Examples

Retrieve all the carbon-carbon bond matches in a molecule

>>> molecule = Molecule.from_iupac('imatinib')
>>> matches = molecule.chemical_environment_matches('[#6X3:1]~[#6X3:2]')
compute_partial_charges_am1bcc(use_conformers=None, strict_n_conformers=False, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit)

Calculate partial atomic charges for this molecule using AM1-BCC run by an underlying toolkit and assign them to this molecule’s partial_charges attribute.

Parameters
strict_n_conformersbool, default=False

Whether to raise an exception if an invalid number of conformers is provided for the given charge method. If this is False and an invalid number of conformers is found, a warning will be raised.

use_conformersiterable of openmm.unit.Quantity-wrapped numpy arrays, each with shape (n_atoms, 3) and dimension of distance. Optional, default=None

Coordinates to use for partial charge calculation. If None, an appropriate number of conformers for the given charge method will be generated.

toolkit_registryopenff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for the calculation

Raises
InvalidToolkitRegistryError

If an invalid object is passed as the toolkit_registry parameter

Examples

>>> molecule = Molecule.from_smiles('CCCCCC')
>>> molecule.generate_conformers()
>>> molecule.compute_partial_charges_am1bcc()
compute_virtual_site_positions_from_atom_positions(atom_positions)

Compute the positions of the virtual sites in this molecule given a set of external coordinates. The coordinates do not need come from an internal conformer, but are assumed to have the same shape and be in the same order.

Parameters
atom_positionsopenmm.unit.Quantity of dimension [Length] wrapping a
numpy.ndarray

The positions of all atoms in the molecule. The array is the size (N, 3) where N is the number of atoms in the molecule.

Returns
openmm.unit.Quantity of dimension [Length] in unit Angstroms wrapping a
numpy.ndarray

The positions of the virtual particles belonging to this virtual site. The array is the size (M, 3) where M is the number of virtual particles belonging to this virtual site.

compute_virtual_site_positions_from_conformer(conformer_idx)

Compute the position of all virtual sites given an existing conformer specified by its index.

Parameters
conformer_idxint

The index of the conformer.

Returns
openmm.unit.Quantity of dimension [Length] in unit Angstroms wrapping a
numpy.ndarray

The positions of the virtual particles belonging to this virtual site. The array is the size (M, 3) where M is the number of virtual particles belonging to this virtual site.

property conformers

Returns the list of conformers for this molecule. This returns a list of openmm.unit.Quantity-wrapped numpy arrays, of shape (3 x n_atoms) and with dimensions of distance. The return value is the actual list of conformers, and changes to the contents affect the original FrozenMolecule.

enumerate_protomers(max_states=10)

Enumerate the formal charges of a molecule to generate different protomoers.

Parameters
max_states: int optional, default=10,

The maximum number of protomer states to be returned.

Returns
molecules: List[openff.toolkit.topology.Molecule],

A list of the protomers of the input molecules not including the input.

enumerate_stereoisomers(undefined_only=False, max_isomers=20, rationalise=True, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit)

Enumerate the stereocenters and bonds of the current molecule.

Parameters
undefined_only: bool optional, default=False

If we should enumerate all stereocenters and bonds or only those with undefined stereochemistry

max_isomers: int optional, default=20

The maximum amount of molecules that should be returned

rationalise: bool optional, default=True

If we should try to build and rationalise the molecule to ensure it can exist

toolkit_registry: openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, default=GLOBAL_TOOLKIT_REGISTRY

ToolkitRegistry or ToolkitWrapper to use to enumerate the stereoisomers.

Returns
molecules: List[openff.toolkit.topology.Molecule]

A list of Molecule instances not including the input molecule.

enumerate_tautomers(max_states=20, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit)

Enumerate the possible tautomers of the current molecule

Parameters
max_states: int optional, default=20

The maximum amount of molecules that should be returned

toolkit_registry: openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, default=GLOBAL_TOOLKIT_REGISTRY

ToolkitRegistry or ToolkitWrapper to use to enumerate the tautomers.

Returns
molecules: List[openff.toolkit.topology.Molecule]

A list of openff.toolkit.topology.Molecule instances not including the input molecule.

find_rotatable_bonds(ignore_functional_groups=None, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit)

Find all bonds classed as rotatable ignoring any matched to the ignore_functional_groups list.

Parameters
ignore_functional_groups: optional, List[str], default=None,

A list of bond SMARTS patterns to be ignored when finding rotatable bonds.

toolkit_registry: openff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for SMARTS matching

Returns
bonds: List[openff.toolkit.topology.molecule.Bond]

The list of openff.toolkit.topology.molecule.Bond instances which are rotatable.

classmethod from_bson(serialized)

Instantiate an object from a BSON serialized representation.

Specification: http://bsonspec.org/

Parameters
serializedbytes

A BSON serialized representation of the object

Returns
instancecls

An instantiated object

classmethod from_dict(molecule_dict)

Create a new Molecule from a dictionary representation

Parameters
molecule_dictOrderedDict

A dictionary representation of the molecule.

Returns
moleculeMolecule

A Molecule created from the dictionary representation

classmethod from_file(file_path, file_format=None, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit, allow_undefined_stereo=False)

Create one or more molecules from a file

Parameters
file_pathstr or file-like object

The path to the file or file-like object to stream one or more molecules from.

file_formatstr, optional, default=None

Format specifier, usually file suffix (eg. ‘MOL2’, ‘SMI’) Note that not all toolkits support all formats. Check ToolkitWrapper.toolkit_file_read_formats for your loaded toolkits for details.

toolkit_registryopenff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper,
optional, default=GLOBAL_TOOLKIT_REGISTRY

ToolkitRegistry or ToolkitWrapper to use for file loading. If a Toolkit is passed, only the highest-precedence toolkit is used

allow_undefined_stereobool, default=False

If false, raises an exception if oemol contains undefined stereochemistry.

Returns
moleculesMolecule or list of Molecules

If there is a single molecule in the file, a Molecule is returned; otherwise, a list of Molecule objects is returned.

Examples

>>> from openff.toolkit.tests.utils import get_monomer_mol2_file_path
>>> mol2_file_path = get_monomer_mol2_file_path('cyclohexane')
>>> molecule = Molecule.from_file(mol2_file_path)
classmethod from_inchi(inchi, allow_undefined_stereo=False, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit)

Construct a Molecule from a InChI representation

Parameters
inchistr

The InChI representation of the molecule.

allow_undefined_stereobool, default=False

Whether to accept InChI with undefined stereochemistry. If False, an exception will be raised if a InChI with undefined stereochemistry is passed into this function.

toolkit_registryopenff.toolkit.utils.toolkits.ToolRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for InChI-to-molecule conversion

Returns
moleculeopenff.toolkit.topology.Molecule

Examples

Make cis-1,2-Dichloroethene:

>>> molecule = Molecule.from_inchi('InChI=1S/C2H2Cl2/c3-1-2-4/h1-2H/b2-1-')
classmethod from_iupac(iupac_name, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit, allow_undefined_stereo=False, **kwargs)

Generate a molecule from IUPAC or common name

Parameters
iupac_namestr

IUPAC name of molecule to be generated

toolkit_registryopenff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=GLOBAL_TOOLKIT_REGISTRY

ToolkitRegistry or ToolkitWrapper to use for chemical environment matches

allow_undefined_stereobool, default=False

If false, raises an exception if molecule contains undefined stereochemistry.

Returns
moleculeMolecule

The resulting molecule with position

Note

This method requires the OpenEye toolkit to be installed. ..

Examples

Create a molecule from an IUPAC name

>>> molecule = Molecule.from_iupac('4-[(4-methylpiperazin-1-yl)methyl]-N-(4-methyl-3-{[4-(pyridin-3-yl)pyrimidin-2-yl]amino}phenyl)benzamide')

Create a molecule from a common name

>>> molecule = Molecule.from_iupac('imatinib')
classmethod from_json(serialized)

Instantiate an object from a JSON serialized representation.

Specification: https://www.json.org/

Parameters
serializedstr

A JSON serialized representation of the object

Returns
instancecls

An instantiated object

classmethod from_mapped_smiles(mapped_smiles, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit, allow_undefined_stereo=False)

Create an Molecule from a mapped SMILES made with cmiles. The molecule will be in the order of the indexing in the mapped smiles string.

Warning

This API is experimental and subject to change.

Parameters
mapped_smiles: str,

A CMILES-style mapped smiles string with explicit hydrogens.

toolkit_registryopenff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional

ToolkitRegistry or ToolkitWrapper to use for SMILES-to-molecule conversion

allow_undefined_stereobool, default=False

If false, raises an exception if oemol contains undefined stereochemistry.

Returns
offmolopenff.toolkit.topology.molecule.Molecule

An OpenFF molecule instance.

Raises
SmilesParsingError

If the given SMILES had no indexing picked up by the toolkits.

classmethod from_messagepack(serialized)

Instantiate an object from a MessagePack serialized representation.

Specification: https://msgpack.org/index.html

Parameters
serializedbytes

A MessagePack-encoded bytes serialized representation

Returns
instancecls

Instantiated object.

classmethod from_openeye(oemol, allow_undefined_stereo=False)

Create a Molecule from an OpenEye molecule.

Requires the OpenEye toolkit to be installed.

Parameters
oemolopeneye.oechem.OEMol

An OpenEye molecule

allow_undefined_stereobool, default=False

If False, raises an exception if oemol contains undefined stereochemistry.

Returns
moleculeopenff.toolkit.topology.Molecule

An OpenFF molecule

Examples

Create a Molecule from an OpenEye OEMol

>>> from openeye import oechem
>>> from openff.toolkit.tests.utils import get_data_file_path
>>> ifs = oechem.oemolistream(get_data_file_path('systems/monomers/ethanol.mol2'))
>>> oemols = list(ifs.GetOEGraphMols())
>>> molecule = Molecule.from_openeye(oemols[0])
classmethod from_pdb_and_smiles(file_path, smiles, allow_undefined_stereo=False)

Create a Molecule from a pdb file and a SMILES string using RDKit.

Requires RDKit to be installed.

Warning

This API is experimental and subject to change.

The molecule is created and sanitised based on the SMILES string, we then find a mapping between this molecule and one from the PDB based only on atomic number and connections. The SMILES molecule is then reindexed to match the PDB, the conformer is attached, and the molecule returned.

Note that any stereochemistry in the molecule is set by the SMILES, and not the coordinates of the PDB.

Parameters
file_path: str

PDB file path

smilesstr

a valid smiles string for the pdb, used for stereochemistry, formal charges, and bond order

allow_undefined_stereobool, default=False

If false, raises an exception if SMILES contains undefined stereochemistry.

Returns
moleculeopenff.toolkit.Molecule

An OFFMol instance with ordering the same as used in the PDB file.

Raises
InvalidConformerError

If the SMILES and PDB molecules are not isomorphic.

classmethod from_pickle(serialized)

Instantiate an object from a pickle serialized representation.

Warning

This is not recommended for safe, stable storage since the pickle specification may change between Python versions.

Parameters
serializedstr

A pickled representation of the object

Returns
instancecls

An instantiated object

classmethod from_qcschema(qca_record, client=None, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit, allow_undefined_stereo=False)

Create a Molecule from a QCArchive molecule record or dataset entry based on attached cmiles information.

For a molecule record, a conformer will be set from its geometry.

For a dataset entry, if a corresponding client instance is provided, the starting geometry for that entry will be used as a conformer.

A QCElemental Molecule produced from Molecule.to_qcschema can be round-tripped through this method to produce a new, valid Molecule.

Parameters
qca_recorddict

A QCArchive molecule record or dataset entry.

clientoptional, default=None,

A qcportal.FractalClient instance to use for fetching an initial geometry. Only used if qca_record is a dataset entry.

toolkit_registryopenff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional

ToolkitRegistry or ToolkitWrapper to use for SMILES-to-molecule conversion

allow_undefined_stereobool, default=False

If false, raises an exception if qca_record contains undefined stereochemistry.

Returns
moleculeopenff.toolkit.topology.Molecule

An OpenFF molecule instance.

Raises
AttributeError
  • If the record dict can not be made from qca_record.

  • If a client is passed and it could not retrieve the initial molecule.

KeyError

If the dict does not contain the canonical_isomeric_explicit_hydrogen_mapped_smiles.

InvalidConformerError

Silent error, if the conformer could not be attached.

Examples

Get Molecule from a QCArchive molecule record:

>>> from qcportal import FractalClient
>>> client = FractalClient()
>>> offmol = Molecule.from_qcschema(client.query_molecules(molecular_formula="C16H20N3O5")[0])

Get Molecule from a QCArchive optimization entry:

>>> from qcportal import FractalClient
>>> client = FractalClient()
>>> optds = client.get_collection("OptimizationDataset",
                                  "SMIRNOFF Coverage Set 1")
>>> offmol = Molecule.from_qcschema(optds.get_entry('coc(o)oc-0'))

Same as above, but with conformer(s) from initial molecule(s) by providing client to database:

>>> offmol = Molecule.from_qcschema(optds.get_entry('coc(o)oc-0'), client=client)
classmethod from_rdkit(rdmol, allow_undefined_stereo=False, hydrogens_are_explicit=False)

Create a Molecule from an RDKit molecule.

Requires the RDKit to be installed.

Parameters
rdmolrkit.RDMol

An RDKit molecule

allow_undefined_stereobool, default=False

If False, raises an exception if rdmol contains undefined stereochemistry.

hydrogens_are_explicitbool, default=False

If False, RDKit will perform hydrogen addition using Chem.AddHs

Returns
moleculeopenff.toolkit.topology.Molecule

An OpenFF molecule

Examples

Create a molecule from an RDKit molecule

>>> from rdkit import Chem
>>> from openff.toolkit.tests.utils import get_data_file_path
>>> rdmol = Chem.MolFromMolFile(get_data_file_path('systems/monomers/ethanol.sdf'))
>>> molecule = Molecule.from_rdkit(rdmol)
classmethod from_smiles(smiles, hydrogens_are_explicit=False, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit, allow_undefined_stereo=False)

Construct a Molecule from a SMILES representation

Parameters
smilesstr

The SMILES representation of the molecule.

hydrogens_are_explicitbool, default = False

If False, the cheminformatics toolkit will perform hydrogen addition

toolkit_registryopenff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for SMILES-to-molecule conversion

allow_undefined_stereobool, default=False

Whether to accept SMILES with undefined stereochemistry. If False, an exception will be raised if a SMILES with undefined stereochemistry is passed into this function.

Returns
moleculeopenff.toolkit.topology.Molecule

Examples

>>> molecule = Molecule.from_smiles('Cc1ccccc1')
classmethod from_toml(serialized)

Instantiate an object from a TOML serialized representation.

Specification: https://github.com/toml-lang/toml

Parameters
serlializedstr

A TOML serialized representation of the object

Returns
instancecls

An instantiated object

classmethod from_topology(topology)

Return a Molecule representation of an OpenFF Topology containing a single Molecule object.

Parameters
topologyopenff.toolkit.topology.Topology

The Topology object containing a single Molecule object. Note that OpenMM and MDTraj Topology objects are not supported.

Returns
moleculeopenff.toolkit.topology.Molecule

The Molecule object in the topology

Raises
ValueError

If the topology does not contain exactly one molecule.

Examples

Create a molecule from a Topology object that contains exactly one molecule

>>> molecule = Molecule.from_topology(topology)  
classmethod from_xml(serialized)

Instantiate an object from an XML serialized representation.

Specification: https://www.w3.org/XML/

Parameters
serializedbytes

An XML serialized representation

Returns
instancecls

Instantiated object.

classmethod from_yaml(serialized)

Instantiate from a YAML serialized representation.

Specification: http://yaml.org/

Parameters
serializedstr

A YAML serialized representation of the object

Returns
instancecls

Instantiated object

generate_conformers(toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit, n_conformers=10, rms_cutoff=None, clear_existing=True)

Generate conformers for this molecule using an underlying toolkit.

If n_conformers=0, no toolkit wrapper will be called. If n_conformers=0 and clear_existing=True, molecule.conformers will be set to None.

Parameters
toolkit_registryopenff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for SMILES-to-molecule conversion

n_conformersint, default=1

The maximum number of conformers to produce

rms_cutoffopenmm.unit.Quantity-wrapped float, in units of distance, optional, default=None

The minimum RMS value at which two conformers are considered redundant and one is deleted. Precise implementation of this cutoff may be toolkit-dependent. If None, the cutoff is set to be the default value for each ToolkitWrapper (generally 1 Angstrom).

clear_existingbool, default=True

Whether to overwrite existing conformers for the molecule

Raises
InvalidToolkitRegistryError

If an invalid object is passed as the toolkit_registry parameter

Examples

>>> molecule = Molecule.from_smiles('CCCCCC')
>>> molecule.generate_conformers()
generate_unique_atom_names()

Generate unique atom names using element name and number of times that element has occurred e.g. ‘C1x’, ‘H1x’, ‘O1x’, ‘C2x’, …

The character ‘x’ is appended to these generated names to reduce the odds that they clash with an atom name or type imported from another source.

get_bond_between(i, j)

Returns the bond between two atoms

Parameters
i, jint or Atom

Atoms or atom indices to check

Returns
bondBond

The bond between i and j.

property has_unique_atom_names

True if the molecule has unique atom names, False otherwise.

property hill_formula

Get the Hill formula of the molecule

property impropers

Iterate over all improper torsions in the molecule.

Returns
impropersset of tuple

An iterator of tuples, each containing the indices of atoms making up a possible improper torsion.

is_isomorphic_with(other, **kwargs)

Check if the molecule is isomorphic with the other molecule which can be an openff.toolkit.topology.Molecule, or TopologyMolecule or nx.Graph(). Full matching is done using the options described bellow.

Warning

This API is experimental and subject to change.

Parameters
other: openff.toolkit.topology.Molecule or TopologyMolecule or nx.Graph()
return_atom_map: bool, default=False, optional

will return an optional dict containing the atomic mapping.

aromatic_matching: bool, default=True, optional
compare the aromatic attributes of bonds and atoms.
formal_charge_matching: bool, default=True, optional
compare the formal charges attributes of the atoms.
bond_order_matching: bool, deafult=True, optional
compare the bond order on attributes of the bonds.
atom_stereochemistry_matchingbool, default=True, optional

If False, atoms’ stereochemistry is ignored for the purpose of determining equality.

bond_stereochemistry_matchingbool, default=True, optional

If False, bonds’ stereochemistry is ignored for the purpose of determining equality.

strip_pyrimidal_n_atom_stereo: bool, default=True, optional

If True, any stereochemistry defined around pyrimidal nitrogen stereocenters will be disregarded in the isomorphism check.

toolkit_registryopenff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for removing stereochemistry from pyrimidal nitrogens.

Returns
isomorphicbool
property n_angles

int: number of angles in the Molecule.

property n_atoms

The number of Atom objects.

property n_bonds

The number of Bond objects.

property n_conformers

Returns the number of conformers for this molecule.

property n_impropers

int: number of possible improper torsions in the Molecule.

property n_particles

The number of Particle objects, which corresponds to how many positions must be used.

property n_propers

int: number of proper torsions in the Molecule.

property n_rings

Return the number of rings found in the Molecule

Requires the RDKit to be installed.

Note

For systems containing some special cases of connected rings, this function may not be well-behaved and may report a different number rings than expected. Some problematic cases include networks of many (5+) rings or bicyclic moieties (i.e. norbornane).

property n_virtual_particles

The number of VirtualParticle objects.

property n_virtual_sites

The number of VirtualSite objects.

property name

The name (or title) of the molecule

nth_degree_neighbors(n_degrees)

Return canonicalized pairs of atoms whose shortest separation is exactly n bonds. Only pairs with increasing atom indices are returned.

Parameters
n: int

The number of bonds separating atoms in each pair

Returns
neighbors: iterator of tuple of Atom

Tuples (len 2) of atom that are separated by n bonds.

Notes

The criteria used here relies on minimum distances; when there are multiple valid paths between atoms, such as atoms in rings, the shortest path is considered. For example, two atoms in “meta” positions with respect to each other in a benzene are separated by two paths, one length 2 bonds and the other length 4 bonds. This function would consider them to be 2 apart and would not include them if n=4 was passed.

property partial_charges

Returns the partial charges (if present) on the molecule.

Returns
partial_chargesa openmm.unit.Quantity - wrapped numpy array [1 x n_atoms] or None

The partial charges on this Molecule’s atoms. Returns None if no charges have been specified.

property particles

Iterate over all Particle objects.

property propers

Iterate over all proper torsions in the molecule

property properties

The properties dictionary of the molecule

remap(mapping_dict, current_to_new=True)

Remap all of the indexes in the molecule to match the given mapping dict

Warning

This API is experimental and subject to change.

Parameters
mapping_dictdict,

A dictionary of the mapping between indexes, this should start from 0.

current_to_newbool, default=True

If this is True, then mapping_dict is of the form {current_index: new_index}; otherwise, it is of the form {new_index: current_index}

Returns
new_moleculeopenff.toolkit.topology.molecule.Molecule

An openff.toolkit.Molecule instance with all attributes transferred, in the PDB order.

property rings

Return the number of rings in this molecule.

Requires the RDKit to be installed.

Note

For systems containing some special cases of connected rings, this function may not be well-behaved and may report a different number rings than expected. Some problematic cases include networks of many (5+) rings or bicyclic moieties (i.e. norbornane).

property smirnoff_impropers

Iterate over improper torsions in the molecule, but only those with trivalent centers, reporting the central atom second in each improper.

Note that it’s possible that a trivalent center will not have an improper assigned. This will depend on the force field that is used.

Also note that this will return 6 possible atom orderings around each improper center. In current SMIRNOFF parameterization, three of these six orderings will be used for the actual assignment of the improper term and measurement of the angles. These three orderings capture the three unique angles that could be calculated around the improper center, therefore the sum of these three terms will always return a consistent energy.

The exact three orderings that will be applied during parameterization can not be determined in this method, since it requires sorting the particle indices, and those indices may change when this molecule is added to a Topology.

For more details on the use of three-fold (‘trefoil’) impropers, see https://open-forcefield-toolkit.readthedocs.io/en/latest/smirnoff.html#impropertorsions

Returns
impropersset of tuple

An iterator of tuples, each containing the indices of atoms making up a possible improper torsion. The central atom is listed second in each tuple.

strip_atom_stereochemistry(smarts, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit)

Delete stereochemistry information for certain atoms, if it is present. This method can be used to “normalize” molecules imported from different cheminformatics toolkits, which differ in which atom centers are considered stereogenic.

Parameters
smarts: str or ChemicalEnvironment

Tagged SMARTS with a single atom with index 1. Any matches for this atom will have any assigned stereocheistry information removed.

toolkit_registrya ToolkitRegistry or ToolkitWrapper object, optional, default=GLOBAL_TOOLKIT_REGISTRY

ToolkitRegistry or ToolkitWrapper to use for I/O operations

to_bson()

Return a BSON serialized representation.

Specification: http://bsonspec.org/

Returns
serializedbytes

A BSON serialized representation of the objecft

to_dict()

Return a dictionary representation of the molecule.

Returns
molecule_dictOrderedDict

A dictionary representation of the molecule.

to_file(file_path, file_format, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit)

Write the current molecule to a file or file-like object

Parameters
file_pathstr or file-like object

A file-like object or the path to the file to be written.

file_formatstr

Format specifier, one of [‘MOL2’, ‘MOL2H’, ‘SDF’, ‘PDB’, ‘SMI’, ‘CAN’, ‘TDT’] Note that not all toolkits support all formats

toolkit_registryopenff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper,
optional, default=GLOBAL_TOOLKIT_REGISTRY

ToolkitRegistry or ToolkitWrapper to use for file writing. If a Toolkit is passed, only the highest-precedence toolkit is used

Raises
ValueError

If the requested file_format is not supported by one of the installed cheminformatics toolkits

Examples

>>> molecule = Molecule.from_iupac('imatinib')
>>> molecule.to_file('imatinib.mol2', file_format='mol2')  
>>> molecule.to_file('imatinib.sdf', file_format='sdf')  
>>> molecule.to_file('imatinib.pdb', file_format='pdb')  
static to_hill_formula(molecule)

Generate the Hill formula from either a FrozenMolecule, TopologyMolecule or nx.Graph() of the molecule

Parameters
moleculeFrozenMolecule, TopologyMolecule or nx.Graph()
Returns
formulathe Hill formula of the molecule
Raises
NotImplementedErrorif the molecule is not of one of the specified types.
to_inchi(fixed_hydrogens=False, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit)

Create an InChI string for the molecule using the requested toolkit backend. InChI is a standardised representation that does not capture tautomers unless specified using the fixed hydrogen layer.

For information on InChi see here https://iupac.org/who-we-are/divisions/division-details/inchi/

Parameters
fixed_hydrogens: bool, default=False

If a fixed hydrogen layer should be added to the InChI, if True this will produce a non standard specific InChI string of the molecule.

toolkit_registryopenff.toolkit.utils.toolkits.ToolRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for molecule-to-InChI conversion

Returns
inchi: str

The InChI string of the molecule.

Raises
InvalidToolkitRegistryError

If an invalid object is passed as the toolkit_registry parameter

to_inchikey(fixed_hydrogens=False, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit)

Create an InChIKey for the molecule using the requested toolkit backend. InChIKey is a standardised representation that does not capture tautomers unless specified using the fixed hydrogen layer.

For information on InChi see here https://iupac.org/who-we-are/divisions/division-details/inchi/

Parameters
fixed_hydrogens: bool, default=False

If a fixed hydrogen layer should be added to the InChI, if True this will produce a non standard specific InChI string of the molecule.

toolkit_registryopenff.toolkit.utils.toolkits.ToolRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for molecule-to-InChIKey conversion

Returns
inchi_key: str

The InChIKey representation of the molecule.

Raises
InvalidToolkitRegistryError

If an invalid object is passed as the toolkit_registry parameter

to_iupac(toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit)

Generate IUPAC name from Molecule

Returns
iupac_namestr

IUPAC name of the molecule

Note

This method requires the OpenEye toolkit to be installed. ..

Examples

>>> from openff.toolkit.utils import get_data_file_path
>>> sdf_filepath = get_data_file_path('molecules/ethanol.sdf')
>>> molecule = Molecule(sdf_filepath)
>>> iupac_name = molecule.to_iupac()
to_json(indent=None)

Return a JSON serialized representation.

Specification: https://www.json.org/

Parameters
indentint, optional, default=None

If not None, will pretty-print with specified number of spaces for indentation

Returns
serializedstr

A JSON serialized representation of the object

to_messagepack()

Return a MessagePack representation.

Specification: https://msgpack.org/index.html

Returns
serializedbytes

A MessagePack-encoded bytes serialized representation of the object

to_networkx()

Generate a NetworkX undirected graph from the Molecule.

Nodes are Atoms labeled with particle indices and atomic elements (via the element node atrribute). Edges denote chemical bonds between Atoms. Virtual sites are not included, since they lack a concept of chemical connectivity.

Returns
graphnetworkx.Graph

The resulting graph, with nodes (atoms) labeled with atom indices, elements, stereochemistry and aromaticity flags and bonds with two atom indices, bond order, stereochemistry, and aromaticity flags

Examples

Retrieve the bond graph for imatinib (OpenEye toolkit required)

>>> molecule = Molecule.from_iupac('imatinib')
>>> nxgraph = molecule.to_networkx()
to_openeye(aromaticity_model='OEAroModel_MDL')

Create an OpenEye molecule

Requires the OpenEye toolkit to be installed.

Parameters
aromaticity_modelstr, optional, default=DEFAULT_AROMATICITY_MODEL

The aromaticity model to use

Returns
oemolopeneye.oechem.OEMol

An OpenEye molecule

Examples

Create an OpenEye molecule from a Molecule

>>> molecule = Molecule.from_smiles('CC')
>>> oemol = molecule.to_openeye()
to_pickle()

Return a pickle serialized representation.

Warning

This is not recommended for safe, stable storage since the pickle specification may change between Python versions.

Returns
serializedstr

A pickled representation of the object

to_qcschema(multiplicity=1, conformer=0, extras=None)

Create a QCElemental Molecule.

Warning

This API is experimental and subject to change.

Parameters
multiplicityint, default=1,

The multiplicity of the molecule; sets molecular_multiplicity field for QCElemental Molecule.

conformerint, default=0,

The index of the conformer to use for the QCElemental Molecule geometry.

extrasdict, default=None

A dictionary that should be included in the extras field on the QCElemental Molecule. This can be used to include extra information, such as a smiles representation.

Returns
qcelemental.models.Molecule

A validated QCElemental Molecule.

Raises
MissingDependencyError

If qcelemental is not installed, the qcschema can not be validated.

InvalidConformerError

No conformer found at the given index.

Examples

Create a QCElemental Molecule:

>>> import qcelemental as qcel
>>> mol = Molecule.from_smiles('CC')
>>> mol.generate_conformers(n_conformers=1)
>>> qcemol = mol.to_qcschema()
to_rdkit(aromaticity_model='OEAroModel_MDL')

Create an RDKit molecule

Requires the RDKit to be installed.

Parameters
aromaticity_modelstr, optional, default=DEFAULT_AROMATICITY_MODEL

The aromaticity model to use

Returns
rdmolrdkit.RDMol

An RDKit molecule

Examples

Convert a molecule to RDKit

>>> from openff.toolkit.utils import get_data_file_path
>>> sdf_filepath = get_data_file_path('molecules/ethanol.sdf')
>>> molecule = Molecule(sdf_filepath)
>>> rdmol = molecule.to_rdkit()
to_smiles(isomeric=True, explicit_hydrogens=True, mapped=False, toolkit_registry=ToolkitRegistry containing The RDKit, AmberTools, Built-in Toolkit)

Return a canonical isomeric SMILES representation of the current molecule. A partially mapped smiles can also be generated for atoms of interest by supplying an atom_map to the properties dictionary.

Note

RDKit and OpenEye versions will not necessarily return the same representation.

Parameters
isomeric: bool optional, default= True

return an isomeric smiles

explicit_hydrogens: bool optional, default=True

return a smiles string containing all hydrogens explicitly

mapped: bool optional, default=False

return a explicit hydrogen mapped smiles, the atoms to be mapped can be controlled by supplying an atom map into the properties dictionary. If no mapping is passed all atoms will be mapped in order, else an atom map dictionary from the current atom index to the map id should be supplied with no duplicates. The map ids (values) should start from 0 or 1.

toolkit_registryopenff.toolkit.utils.toolkits.ToolkitRegistry or openff.toolkit.utils.toolkits.ToolkitWrapper, optional, default=None

ToolkitRegistry or ToolkitWrapper to use for SMILES conversion

Returns
smilesstr

Canonical isomeric explicit-hydrogen SMILES

Examples

>>> from openff.toolkit.utils import get_data_file_path
>>> sdf_filepath = get_data_file_path('molecules/ethanol.sdf')
>>> molecule = Molecule(sdf_filepath)
>>> smiles = molecule.to_smiles()
to_toml()

Return a TOML serialized representation.

Specification: https://github.com/toml-lang/toml

Returns
serializedstr

A TOML serialized representation of the object

to_topology()

Return an OpenFF Topology representation containing one copy of this molecule

Returns
topologyopenff.toolkit.topology.Topology

A Topology representation of this molecule

Examples

>>> molecule = Molecule.from_iupac('imatinib')
>>> topology = molecule.to_topology()
to_xml(indent=2)

Return an XML representation.

Specification: https://www.w3.org/XML/

Parameters
indentint, optional, default=2

If not None, will pretty-print with specified number of spaces for indentation

Returns
serializedbytes

A MessagePack-encoded bytes serialized representation.

to_yaml()

Return a YAML serialized representation.

Specification: http://yaml.org/

Returns
serializedstr

A YAML serialized representation of the object

property torsions

Get an iterator over all i-j-k-l torsions. Note that i-j-k-i torsions (cycles) are excluded.

Returns
torsionsiterable of 4-Atom tuples
property total_charge

Return the total charge on the molecule

property virtual_sites

Iterate over all VirtualSite objects.

add_atom(atomic_number, formal_charge, is_aromatic, stereochemistry=None, name=None)[source]

Add an atom

Parameters
atomic_numberint

Atomic number of the atom

formal_chargeint

Formal charge of the atom

is_aromaticbool

If True, atom is aromatic; if False, not aromatic

stereochemistrystr, optional, default=None

Either 'R' or 'S' for specified stereochemistry, or None if stereochemistry is irrelevant

namestr, optional

An optional name for the atom

Returns
indexint

The index of the atom in the molecule

Examples

Define a methane molecule

>>> molecule = Molecule()
>>> molecule.name = 'methane'
>>> C = molecule.add_atom(6, 0, False)
>>> H1 = molecule.add_atom(1, 0, False)
>>> H2 = molecule.add_atom(1, 0, False)
>>> H3 = molecule.add_atom(1, 0, False)
>>> H4 = molecule.add_atom(1, 0, False)
>>> bond_idx = molecule.add_bond(C, H1, False, 1)
>>> bond_idx = molecule.add_bond(C, H2, False, 1)
>>> bond_idx = molecule.add_bond(C, H3, False, 1)
>>> bond_idx = molecule.add_bond(C, H4, False, 1)
add_bond_charge_virtual_site(atoms, distance, **kwargs)[source]

Add a virtual site representing the charge on a bond.

Create a bond charge-type virtual site, in which the location of the charge is specified by the position of two atoms. This supports placement of a virtual site \(S\) along a vector between two specified atoms, e.g. to allow for a sigma hole for halogens or similar contexts. With positive values of the distance, the virtual site lies outside the first indexed atom.

Parameters
atomslist of openff.toolkit.topology.molecule.Atom objects

The atoms defining the virtual site’s position

distanceopenmm.unit.Quantity of dimension [Length] wrapping a scalar
charge_incrementslist of floats of shape [N], optional, default=None

The amount of charge to remove from the VirtualSite’s atoms and put in the VirtualSite. Indexing in this list should match the ordering in the atoms list. Default is None.

epsilonfloat

Epsilon term for VdW properties of virtual site. Default is None.

sigmafloat, default=None

Sigma term for VdW properties of virtual site. Default is None.

rmin_halffloat

Rmin_half term for VdW properties of virtual site. Default is None.

namestring or None, default=’’

The name of this virtual site. Default is ‘’.

symmetricbool, default=True

Whether to make virtual site symmetric by creating two particles instead of just one. As an example, for N_2 this should be set to True to model both lone pairs with the same parameters.

Returns
indexint

The index of the newly-added virtual site in the molecule

add_monovalent_lone_pair_virtual_site(atoms, distance, out_of_plane_angle, in_plane_angle, **kwargs)[source]

Create a bond charge-type virtual site, in which the location of the charge is specified by the position of three atoms.

Parameters
atomslist of three openff.toolkit.topology.molecule.Atom objects

The three atoms defining the virtual site’s position

distanceopenmm.unit.Quantity of dimension [Length] wrapping a scalar
out_of_plane_angleopenmm.unit.Quantity of dimension [Angle] wrapping
a scalar
in_plane_angleopenmm.unit.Quantity of dimension [Angle] wrapping a
scalar
epsilonfloat

Epsilon term for VdW properties of virtual site. Default is None.

sigmafloat, default=None

Sigma term for VdW properties of virtual site. Default is None.

rmin_halffloat

Rmin_half term for VdW properties of virtual site. Default is None.

namestring or None, default=’’

The name of this virtual site. Default is ‘’.

symmetricbool, default=False

Whether to make virtual site symmetric by creating two particles instead of just one. Note that because this site is defined is placed on the noncentral atom, setting this to True will place one particle on atom1, and the other on atom3.

Returns
indexint

The index of the newly-added virtual site in the molecule

add_divalent_lone_pair_virtual_site(atoms, distance, out_of_plane_angle, **kwargs)[source]

Create a divalent lone pair-type virtual site, in which the location of the charge is specified by the position of three atoms.

Parameters
atomslist of three openff.toolkit.topology.molecule.Atom objects

The three atoms defining the virtual site’s position

distanceopenmm.unit.Quantity of dimension [Length] wrapping a scalar
out_of_plane_angleopenmm.unit.Quantity of dimension [Angle] wrapping
a scalar
epsilonfloat

Epsilon term for VdW properties of virtual site. Default is None.

sigmafloat, default=None

Sigma term for VdW properties of virtual site. Default is None.

rmin_halffloat

Rmin_half term for VdW properties of virtual site. Default is None.

namestring or None, default=’’

The name of this virtual site. Default is ‘’.

symmetricbool, default=True

Whether to make virtual site symmetric by creating two particles instead of just one. As an example, for TIP5 should be set to True to model both lone pairs with the same parameters.

Returns
indexint

The index of the newly-added virtual site in the molecule

add_trivalent_lone_pair_virtual_site(atoms, distance, **kwargs)[source]

Create a trivalent lone pair-type virtual site, in which the location of the charge is specified by the position of four atoms.

Parameters
atomslist of four openff.toolkit.topology.molecule.Atom objects

The four atoms defining the virtual site’s position

distanceopenmm.unit.Quantity of dimension [Length] wrapping a scalar
epsilonfloat

Epsilon term for VdW properties of virtual site. Default is None.

sigmafloat, default=None

Sigma term for VdW properties of virtual site. Default is None.

rmin_halffloat

Rmin_half term for VdW properties of virtual site. Default is None.

namestring or None, default=’’

The name of this virtual site. Default is ‘’.

Returns
indexint

The index of the newly-added virtual site in the molecule

add_bond(atom1, atom2, bond_order, is_aromatic, stereochemistry=None, fractional_bond_order=None)[source]

Add a bond between two specified atom indices

Parameters
atom1int or openff.toolkit.topology.molecule.Atom

Index of first atom

atom2int or openff.toolkit.topology.molecule.Atom

Index of second atom

bond_orderint

Integral bond order of Kekulized form

is_aromaticbool

True if this bond is aromatic, False otherwise

stereochemistrystr, optional, default=None

Either 'E' or 'Z' for specified stereochemistry, or None if stereochemistry is irrelevant

fractional_bond_orderfloat, optional, default=None

The fractional (eg. Wiberg) bond order

Returns
index: int

Index of the bond in this molecule

add_conformer(coordinates)[source]

Add a conformation of the molecule

Parameters
coordinates: openmm.unit.Quantity(np.array) with shape (n_atoms, 3) and dimension of distance

Coordinates of the new conformer, with the first dimension of the array corresponding to the atom index in the Molecule’s indexing system.

Returns
index: int

The index of this conformer

visualize(backend='rdkit', width=None, height=None, show_all_hydrogens=True)[source]

Render a visualization of the molecule in Jupyter

Parameters
backendstr, optional, default=’rdkit’

Which visualization engine to use. Choose from:

  • rdkit

  • openeye

  • nglview (conformers needed)

widthint, optional, default=500

Width of the generated representation (only applicable to backend=openeye or backend=rdkit)

heightint, optional, default=300

Width of the generated representation (only applicable to backend=openeye or backend=rdkit)

show_all_hydrogensbool, optional, default=True

Whether to explicitly depict all hydrogen atoms. (only applicable to backend=openeye or backend=rdkit)

Returns
object

Depending on the backend chosen:

  • rdkit → IPython.display.SVG

  • openeye → IPython.display.Image

  • nglview → nglview.NGLWidget