DGLMoleculeDataset

class openff.nagl.nn.DGLMoleculeDataset(entries: Tuple[DGLMoleculeDatasetEntry, ...] = tuple())[source]

Bases: Dataset

Methods

from_arrow_dataset

from_openff

to_pyarrow

Convert the dataset to a PyArrow table.

Attributes

classmethod from_arrow_dataset(path: Path, format: str = 'parquet', atom_features: List[AtomFeature] | None = None, bond_features: List[BondFeature] | None = None, atom_feature_column: str | None = None, bond_feature_column: str | None = None, smiles_column: str = 'mapped_smiles', columns: List[str] | None = None, n_processes: int = 0)[source]

classmethod from_openff(molecules: Iterable[Molecule], atom_features: List[AtomFeature] | None = None, bond_features: List[BondFeature] | None = None, atom_feature_tensors: List[Tensor] | None = None, bond_feature_tensors: List[Tensor] | None = None, labels: List[Dict[str, Any]] | None = None, label_function: Callable[[Molecule], Dict[str, Any]] | None = None)[source]

property n_atom_features: int

to_pyarrow(atom_feature_column: str = 'atom_features', bond_feature_column: str = 'bond_features', smiles_column: str = 'mapped_smiles')[source]

Convert the dataset to a PyArrow table.

The table contains at least the SMILES, atom features, and bond features, stored under the column names given as arguments. It also contains any labels present in the entries.

Parameters:
  • atom_feature_column – The name of the column to use for the atom features.

  • bond_feature_column – The name of the column to use for the bond features.

  • smiles_column – The name of the column to use for the SMILES strings.

Returns:

table – The PyArrow table built from the dataset.
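The column layout described above can be pictured with a small pure-Python sketch. It is an illustration only, not the library's implementation: the helper `sketch_table_columns`, the dict-based entries, and all feature and label values are hypothetical, and it assumes that each label name becomes its own column.

```python
from typing import Any, Dict, List


def sketch_table_columns(
    entries: List[Dict[str, Any]],
    atom_feature_column: str = "atom_features",
    bond_feature_column: str = "bond_features",
    smiles_column: str = "mapped_smiles",
) -> Dict[str, List[Any]]:
    """Arrange hypothetical entries into a column-oriented mapping.

    Each entry is assumed to be a dict with the keys "smiles",
    "atom_features", "bond_features", and "labels" (a dict).
    """
    # The three columns that are always present, named by the arguments.
    columns: Dict[str, List[Any]] = {
        smiles_column: [],
        atom_feature_column: [],
        bond_feature_column: [],
    }

    # Assumption: one extra column per label name found in any entry.
    label_names = sorted({name for entry in entries for name in entry["labels"]})
    for name in label_names:
        columns[name] = []

    for entry in entries:
        columns[smiles_column].append(entry["smiles"])
        columns[atom_feature_column].append(entry["atom_features"])
        columns[bond_feature_column].append(entry["bond_features"])
        for name in label_names:
            # Entries missing a label get None in that column.
            columns[name].append(entry["labels"].get(name))

    return columns


# Made-up entry for hydrogen chloride; the numbers are placeholders.
example_entries = [
    {
        "smiles": "[H:1][Cl:2]",
        "atom_features": [1.0, 0.0, 0.0, 1.0],
        "bond_features": [1.0],
        "labels": {"charges": [0.18, -0.18]},
    }
]

columns = sketch_table_columns(example_entries, smiles_column="smiles")
print(list(columns))
```

Passing `smiles_column="smiles"` renames only the SMILES column. The other two keep their defaults, mirroring how the three keyword arguments of `to_pyarrow` each control one column name.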