RDKitToolkitWrapper

class openff.toolkit.utils.toolkits.RDKitToolkitWrapper[source]

RDKit toolkit wrapper

Warning

This API is experimental and subject to change.

__init__()[source]

Methods

__init__()

apply_elf_conformer_selection(molecule[, ...])

Applies the ELF method to select a set of diverse conformers which have minimal electrostatically strongly interacting functional groups from a molecules conformers.

assign_partial_charges(molecule[, ...])

Compute partial charges with RDKit, and assign the new values to the partial_charges attribute.

atom_is_in_ring(atom)

Return whether or not an atom is in a ring.

bond_is_in_ring(bond)

Return whether or not a bond is in a ring.

canonical_order_atoms(molecule)

Canonical order the atoms in the molecule using the RDKit.

enumerate_stereoisomers(molecule[, ...])

Enumerate the stereocenters and bonds of the current molecule.

enumerate_tautomers(molecule[, max_states])

Enumerate the possible tautomers of the current molecule.

find_smarts_matches(molecule, smarts[, ...])

Find all SMARTS matches for the specified molecule, using the specified aromaticity model.

from_file(file_path, file_format[, ...])

Create an openff.toolkit.topology.Molecule from a file using this toolkit.

from_file_obj(file_obj, file_format[, ...])

Return an openff.toolkit.topology.Molecule from a file-like object using this toolkit.

from_inchi(inchi[, allow_undefined_stereo, ...])

Construct a Molecule from a InChI representation

from_object(obj[, allow_undefined_stereo, _cls])

If given an rdchem.Mol (or rdchem.Mol-derived object), this function will load it into an openff.toolkit.topology.molecule.

from_pdb_and_smiles(file_path, smiles[, ...])

Create a Molecule from a pdb file and a SMILES string using RDKit.

from_rdkit(rdmol[, allow_undefined_stereo, ...])

Create a Molecule from an RDKit molecule.

from_smiles(smiles[, ...])

Create a Molecule from a SMILES string using the RDKit toolkit.

generate_conformers(molecule[, ...])

Generate molecule conformers using RDKit.

get_tagged_smarts_connectivity(smarts)

Returns a tuple of tuples indicating connectivity between tagged atoms in a SMARTS string.

is_available()

Check whether the RDKit toolkit can be imported

requires_toolkit()

to_file(molecule, file_path, file_format)

Writes an OpenFF Molecule to a file-like object

to_file_obj(molecule, file_obj, file_format)

Writes an OpenFF Molecule to a file-like object

to_inchi(molecule[, fixed_hydrogens])

Create an InChI string for the molecule using the RDKit Toolkit.

to_inchikey(molecule[, fixed_hydrogens])

Create an InChIKey for the molecule using the RDKit Toolkit.

to_rdkit(molecule[, aromaticity_model])

Create an RDKit molecule Requires the RDKit to be installed.

to_smiles(molecule[, isomeric, ...])

Uses the RDKit toolkit to convert a Molecule into a SMILES string.

Attributes

SUPPORTED_CHARGE_METHODS

supported_charge_methods

toolkit_file_read_formats

List of file formats that this toolkit can read.

toolkit_file_write_formats

List of file formats that this toolkit can write.

toolkit_installation_instructions

Instructions on how to install the wrapped toolkit.

toolkit_name

Return the name of the toolkit wrapped by this class as a str

toolkit_version

Return the version of the wrapped toolkit as a str

property toolkit_file_write_formats: list[str]

List of file formats that this toolkit can write.

classmethod is_available() bool[source]

Check whether the RDKit toolkit can be imported

Returns:

is_installed – True if RDKit is installed, False otherwise.

from_object(obj, allow_undefined_stereo: bool = False, _cls=None)[source]

If given an rdchem.Mol (or rdchem.Mol-derived object), this function will load it into an openff.toolkit.topology.molecule. Otherwise, it will return False.

Parameters:
  • obj – An object to be type-checked and converted into a Molecule, if possible.

  • allow_undefined_stereo – Whether to accept molecules with undefined stereocenters. If False, an exception will be raised if a molecule with undefined stereochemistry is passed into this function.

  • _cls – Molecule constructor

Returns:

Molecule or False – An openff.toolkit.topology.molecule Molecule.

Raises:

NotImplementedError – If the object could not be converted into a Molecule.

from_pdb_and_smiles(file_path: str, smiles: str, allow_undefined_stereo: bool = False, _cls=None, name: str = '')[source]

Create a Molecule from a pdb file and a SMILES string using RDKit.

Requires RDKit to be installed.

The molecule is created and sanitised based on the SMILES string, we then find a mapping between this molecule and one from the PDB based only on atomic number and connections. The SMILES molecule is then reindexed to match the PDB, the conformer is attached, and the molecule returned.

Note that any stereochemistry in the molecule is set by the SMILES, and not the coordinates of the PDB.

Parameters:
  • file_path – PDB file path

  • smiles – a valid smiles string for the pdb, used for stereochemistry, formal charges, and bond order

  • allow_undefined_stereo – If false, raises an exception if SMILES contains undefined stereochemistry.

  • _cls – Molecule constructor

  • name – An optional name for the output molecule

Returns:

molecule – An OFFMol instance with ordering the same as used in the PDB file.

Raises:

InvalidConformerError

from_file(file_path: str, file_format: str, allow_undefined_stereo: bool = False, _cls=None)[source]

Create an openff.toolkit.topology.Molecule from a file using this toolkit.

Parameters:
  • file_path – The file to read the molecule from

  • file_format – Format specifier, usually file suffix (eg. ‘MOL2’, ‘SMI’) Note that not all toolkits support all formats. Check ToolkitWrapper.toolkit_file_read_formats for details.

  • allow_undefined_stereo – If false, raises an exception if RDMol contains undefined stereochemistry.

  • _cls – Molecule constructor

Returns:

molecules – a list of Molecule objects is returned.

from_file_obj(file_obj, file_format: str, allow_undefined_stereo: bool = False, _cls=None)[source]

Return an openff.toolkit.topology.Molecule from a file-like object using this toolkit.

A file-like object is an object with a “.read()” method.

Warning

This API is experimental and subject to change.

Parameters:
  • file_obj – The file-like object to read the molecule from

  • file_format – Format specifier, usually file suffix (eg. ‘MOL2’, ‘SMI’) Note that not all toolkits support all formats. Check ToolkitWrapper.toolkit_file_read_formats for details.

  • allow_undefined_stereo – If false, raises an exception if RDMol contains undefined stereochemistry.

  • _cls – Molecule constructor

Returns:

molecules – a list of Molecule objects is returned.

to_file_obj(molecule: Molecule, file_obj, file_format: str)[source]

Writes an OpenFF Molecule to a file-like object

Parameters:
  • molecule – The molecule to write

  • file_obj – The file-like object to write to

  • file_format – The format for writing the molecule data

to_file(molecule: Molecule, file_path: str, file_format: str)[source]

Writes an OpenFF Molecule to a file-like object

Parameters:
  • molecule – The molecule to write

  • file_path – The file path to write to

  • file_format – The format for writing the molecule data

enumerate_stereoisomers(molecule: Molecule, undefined_only: bool = False, max_isomers: int = 20, rationalise: bool = True) list['Molecule'][source]

Enumerate the stereocenters and bonds of the current molecule.

Parameters:
  • molecule – The molecule whose state we should enumerate

  • undefined_only – If we should enumerate all stereocenters and bonds or only those with undefined stereochemistry

  • max_isomers – The maximum amount of molecules that should be returned

  • rationalise – If we should try to build and rationalise the molecule to ensure it can exist

Returns:

molecules – A list of openff.toolkit.topology.Molecule instances

enumerate_tautomers(molecule: Molecule, max_states: int = 20) list['Molecule'][source]

Enumerate the possible tautomers of the current molecule.

Parameters:
  • molecule – The molecule whose state we should enumerate

  • max_states – The maximum amount of molecules that should be returned

Returns:

molecules – A list of openff.toolkit.topology.Molecule instances not including the input molecule.

canonical_order_atoms(molecule: Molecule) Molecule[source]

Canonical order the atoms in the molecule using the RDKit.

Parameters:
  • molecule

    The input molecule

    Returns

  • -------

  • molecule – The input molecule, with canonically-indexed atoms and bonds.

to_smiles(molecule: Molecule, isomeric: bool = True, explicit_hydrogens: bool = True, mapped: bool = False)[source]

Uses the RDKit toolkit to convert a Molecule into a SMILES string. A partially mapped smiles can also be generated for atoms of interest by supplying an atom_map to the properties dictionary.

Parameters:
  • molecule – The molecule to convert into a SMILES.

  • isomeric – return an isomeric smiles

  • explicit_hydrogens – return a smiles string containing all hydrogens explicitly

  • mapped – return a explicit hydrogen mapped smiles, the atoms to be mapped can be controlled by supplying an atom map into the properties dictionary. If no mapping is passed all atoms will be mapped in order, else an atom map dictionary from the current atom index to the map id should be supplied with no duplicates. The map ids (values) should start from 0 or 1.

Returns:

smiles – The SMILES of the input molecule.

from_smiles(smiles: str, hydrogens_are_explicit: bool = False, allow_undefined_stereo: bool = False, _cls=None, name: str = '')[source]

Create a Molecule from a SMILES string using the RDKit toolkit.

Warning

This API is experimental and subject to change.

Parameters:
  • smiles – The SMILES string to turn into a molecule

  • hydrogens_are_explicit – If False, RDKit will perform hydrogen addition using Chem.AddHs

  • allow_undefined_stereo – Whether to accept SMILES with undefined stereochemistry. If False, an exception will be raised if a SMILES with undefined stereochemistry is passed into this function.

  • _cls – Molecule constructor

  • name – An optional name to pass to the _cls constructor

Returns:

molecule – An OpenFF style molecule.

Raises:

RadicalsNotSupportedError

from_inchi(inchi: str, allow_undefined_stereo: bool = False, _cls=None, name: str = '')[source]

Construct a Molecule from a InChI representation

Parameters:
  • inchi – The InChI representation of the molecule.

  • allow_undefined_stereo – Whether to accept InChI with undefined stereochemistry. If False, an exception will be raised if a InChI with undefined stereochemistry is passed into this function.

  • _cls – Molecule constructor

Returns:

molecule

generate_conformers(molecule: Molecule, n_conformers: int = 1, rms_cutoff: Quantity | None = None, clear_existing: bool = True, _cls=None, make_carboxylic_acids_cis: bool = False)[source]

Generate molecule conformers using RDKit.

Warning

This API is experimental and subject to change.

Parameters:
  • molecule – The molecule to generate conformers for.

  • n_conformers – Maximum number of conformers to generate.

  • rms_cutoff – The minimum RMS value at which two conformers are considered redundant and one is deleted. If None, the cutoff is set to 1 Angstrom

  • clear_existing – Whether to overwrite existing conformers for the molecule.

  • _cls – Molecule constructor

  • make_carboxylic_acids_cis – Guarantee all conformers have exclusively cis carboxylic acid groups (COOH) by rotating the proton in any trans carboxylic acids 180 degrees around the C-O bond.

assign_partial_charges(molecule: Molecule, partial_charge_method: str | None = None, use_conformers: list[Quantity] | None = None, strict_n_conformers: bool = False, normalize_partial_charges: bool = True, _cls=None)[source]

Compute partial charges with RDKit, and assign the new values to the partial_charges attribute.

Warning

This API is experimental and subject to change.

Parameters:
  • molecule – Molecule for which partial charges are to be computed

  • partial_charge_method

    The charge model to use. One of [‘mmff94’, ‘gasteiger’]. If None, ‘mmff94’ will be used.

    • ’mmff94’: Applies partial charges using the Merck Molecular Force Field

      (MMFF). This method does not make use of conformers, and hence use_conformers and strict_n_conformers will not impact the partial charges produced.

  • use_conformers – shape (n_atoms, 3) and dimension of distance. Optional, default = None Coordinates to use for partial charge calculation. If None, an appropriate number of conformers will be generated.

  • strict_n_conformers – Whether to raise an exception if an invalid number of conformers is provided for the given charge method. If this is False and an invalid number of conformers is found, a warning will be raised.

  • normalize_partial_charges – Whether to offset partial charges so that they sum to the total formal charge of the molecule. This is used to prevent accumulation of rounding errors when the partial charge generation method has low precision.

  • _cls – Molecule constructor

Raises:
  • ChargeMethodUnavailableError if the requested charge method can not be handled by this toolkit

  • ChargeCalculationError if the charge method is supported by this toolkit, but fails

apply_elf_conformer_selection(molecule: Molecule, percentage: float = 2.0, limit: int = 10, rms_tolerance: Quantity = 0.05 * unit.angstrom)[source]

Applies the ELF method to select a set of diverse conformers which have minimal electrostatically strongly interacting functional groups from a molecules conformers.

The diverse conformer selection is performed by the _elf_select_diverse_conformers function, which attempts to greedily select conformers which are most distinct according to their RMS.

Warning

  • Although this function is inspired by the OpenEye ELF10 method, this implementation may yield slightly different conformers due to potential differences in this and the OE closed source implementation.

Notes

  • The input molecule should have a large set of conformers already generated to select the ELF10 conformers from.

  • The selected conformers will be retained in the molecule.conformers list while unselected conformers will be discarded.

  • Only heavy atoms are included when using the RMS to select diverse conformers.

See also

RDKitToolkitWrapper._elf_select_diverse_conformers

Parameters:
  • molecule – The molecule which contains the set of conformers to select from.

  • percentage – The percentage of conformers with the lowest electrostatic interaction energies to greedily select from.

  • limit – The maximum number of conformers to select.

  • rms_tolerance – Conformers whose RMS is within this amount will be treated as identical and the duplicate discarded.

from_rdkit(rdmol, allow_undefined_stereo: bool = False, hydrogens_are_explicit: bool = False, _cls=None)[source]

Create a Molecule from an RDKit molecule.

Requires the RDKit to be installed.

Warning

This API is experimental and subject to change.

Parameters:
  • rdmol – An RDKit molecule

  • allow_undefined_stereo – If false, raises an exception if rdmol contains undefined stereochemistry.

  • hydrogens_are_explicit – If False, RDKit will perform hydrogen addition using Chem.AddHs

  • _cls – Molecule constructor

Returns:

molecule – An OpenFF molecule

Examples

Create a molecule from an RDKit molecule

>>> from rdkit import Chem
>>> from openff.toolkit._tests.utils import get_data_file_path
>>> rdmol = Chem.MolFromMolFile(get_data_file_path('systems/monomers/ethanol.sdf'))
>>> toolkit_wrapper = RDKitToolkitWrapper()
>>> molecule = toolkit_wrapper.from_rdkit(rdmol)
to_rdkit(molecule: Molecule, aromaticity_model: str = DEFAULT_AROMATICITY_MODEL)[source]

Create an RDKit molecule Requires the RDKit to be installed.

Warning

This API is experimental and subject to change.

Parameters:

aromaticity_model – The aromaticity model to use. Only OEAroModel_MDL is supported.

Returns:

rdmol – An RDKit molecule

Examples

Convert a molecule to RDKit >>> from openff.toolkit import Molecule >>> ethanol = Molecule.from_smiles(‘CCO’) >>> rdmol = ethanol.to_rdkit()

to_inchi(molecule: Molecule, fixed_hydrogens: bool = False)[source]

Create an InChI string for the molecule using the RDKit Toolkit. InChI is a standardised representation that does not capture tautomers unless specified using the fixed hydrogen layer.

For information on InChi see here https://iupac.org/who-we-are/divisions/division-details/inchi/

Parameters:
  • molecule – The molecule to convert into a SMILES.

  • fixed_hydrogens – If a fixed hydrogen layer should be added to the InChI, if True this will produce a non standard specific InChI string of the molecule.

Returns:

inchi – The InChI string of the molecule.

Raises:

EmptyInChiError – If RDKit failed to generate an InChI for the molecule

to_inchikey(molecule: Molecule, fixed_hydrogens: bool = False) str[source]

Create an InChIKey for the molecule using the RDKit Toolkit. InChIKey is a standardised representation that does not capture tautomers unless specified using the fixed hydrogen layer.

For information on InChi see here https://iupac.org/who-we-are/divisions/division-details/inchi/

Parameters:
  • molecule – The molecule to convert into a SMILES.

  • fixed_hydrogens – If a fixed hydrogen layer should be added to the InChI, if True this will produce a non standard specific InChI string of the molecule.

Returns:

inchi_key – The InChIKey representation of the molecule.

Raises:

EmptyInChiError – If RDKit failed to generate an InChI for the molecule

get_tagged_smarts_connectivity(smarts: str)[source]

Returns a tuple of tuples indicating connectivity between tagged atoms in a SMARTS string. Does not return bond order.

Parameters:

smarts – The tagged SMARTS to analyze

Returns:

  • unique_tags – A sorted tuple of all unique tagged atom map indices.

  • tagged_atom_connectivity

    A tuple of tuples, where each inner tuple is a pair of tagged atoms (tag_idx_1, tag_idx_2)

    which are bonded. The inner tuples are ordered smallest-to-largest, and the tuple of tuples is ordered lexically. So the return value for an improper torsion would be ((1, 2), (2, 3), (2, 4)).

Raises:

SMIRKSParsingError – If RDKit was unable to parse the provided smirks/tagged smarts

find_smarts_matches(molecule: Molecule, smarts: str, aromaticity_model: str = 'OEAroModel_MDL', unique: bool = False) list[tuple[int, ...]][source]

Find all SMARTS matches for the specified molecule, using the specified aromaticity model.

Warning

This API is experimental and subject to change.

Parameters:
  • molecule – The molecule for which all specified SMARTS matches are to be located

  • smarts – SMARTS string with optional SMIRKS-style atom tagging

  • aromaticity_model – Molecule is prepared with this aromaticity model prior to querying.

  • unique – If True, only return unique matches. If False, return all matches.

  • : (.. note) – Currently:

  • OEAroModel_MDL (the only supported aromaticity_model is) –

atom_is_in_ring(atom: Atom) bool[source]

Return whether or not an atom is in a ring.

It is assumed that this atom is in molecule.

Parameters:

atom – The molecule containing the atom of interest

Returns:

is_in_ring – Whether or not the atom is in a ring.

Raises:

NotAttachedToMoleculeError

bond_is_in_ring(bond: Bond) bool[source]

Return whether or not a bond is in a ring.

It is assumed that this atom is in molecule.

Parameters:

bond – The molecule containing the atom of interest

Returns:

is_in_ring – Whether or not the bond of index bond_index is in a ring

Raises:

NotAttachedToMoleculeError

property toolkit_file_read_formats

List of file formats that this toolkit can read.

property toolkit_installation_instructions

Instructions on how to install the wrapped toolkit.

property toolkit_name

Return the name of the toolkit wrapped by this class as a str

Warning

This API is experimental and subject to change.

Returns:

toolkit_name – The name of the wrapped toolkit

property toolkit_version

Return the version of the wrapped toolkit as a str

Warning

This API is experimental and subject to change.

Returns:

toolkit_version – The version of the wrapped toolkit