Retrieving Result Collections
This example shows how QCSubmit can be used to retrieve the results of quantum chemical (QC) calculations from a QCFractal instance such as QCArchive.
In particular, it demonstrates how:
raw torsion drive, optimised geometry and hessian result records can be retrieved from the public QCArchive server and stored in a result collection
the retrieved result records can be filtered and curated using a set of built-in filters
the result collection can be saved to and loaded from disk
For the sake of clarity all verbose warnings will be disabled in this tutorial:
[1]:
import warnings
warnings.filterwarnings("ignore")
import logging
logging.getLogger("openff.toolkit").setLevel(logging.ERROR)
Retrieving result collections
QCSubmit provides a suite of utilities for retrieving and curating collections of QC results directly from a running QCFractal server, or an already computed QCPortal dataset. This functionality is provided through three main classes:
BasicResultCollection - stores references to simple QCPortal result records that may contain energies, gradients, or hessians computed for a molecule in a single conformation.
OptimizationResultCollection - stores references to full optimization result records (i.e. OptimizationRecord objects), as well as the final minimised conformer produced by the optimization.
TorsionDriveResultCollection - stores references to full torsion drive result records (i.e. TorsiondriveRecord objects), as well as the minimum energy conformer associated with each torsion angle that was scanned.
Each of these collections can be generated directly from a running QCFractal server using the from_server class method.
We begin by creating a QCPortal PortalClient instance that will allow us to communicate with the running server at https://api.qcarchive.molssi.org:443. We also supply the current directory as the cache_dir argument, which will lead to each downloaded dataset being saved to a SQLite database.
[2]:
from qcportal import PortalClient
qc_client = PortalClient("https://api.qcarchive.molssi.org:443", cache_dir=".")
Many of the datasets we want to retrieve will be quite large, so we also define a small helper function to print only a subset of their records:
[3]:
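from copy import deepcopy


def preview_dataset(ds, max_entries=5):
    """Print a copy of the collection truncated to max_entries results per server."""
    # Work on a copy so the original collection keeps all of its entries.
    ds = deepcopy(ds)
    for key, entry in ds.entries.items():
        ds.entries[key] = entry[:max_entries]
    print(ds)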
We can then use our client to generate our result collections:
[4]:
from openff.qcsubmit.results import (
    BasicResultCollection,
    OptimizationResultCollection,
    TorsionDriveResultCollection,
)
# Pull down the energy result records from the 'OpenFF BCC Refit Study COH v1.0' dataset.
energy_result_collection = BasicResultCollection.from_server(
    client=qc_client,
    datasets="OpenFF BCC Refit Study COH v1.0",
    spec_name="spec_2",  # This used to be "resp-2-vacuum", but the spec name was changed in the QCArchive 0.50 migration
)
preview_dataset(energy_result_collection)
# Pull down the optimization records from both the 'OpenFF Gen 2 Opt Set 3 Pfizer Discrepancy' and
# 'OpenFF Gen 2 Opt Set 4 eMolecules Discrepancy' datasets.
optimization_result_collection = OptimizationResultCollection.from_server(
    client=qc_client,
    datasets=[
        "OpenFF Gen 2 Opt Set 3 Pfizer Discrepancy",
        "OpenFF Gen 2 Opt Set 4 eMolecules Discrepancy",
    ],
    spec_name="default",
)
preview_dataset(optimization_result_collection)
# Pull down the torsion drive records from the 'OpenFF Protein Capped 3-mer Backbones v1.0' dataset.
torsion_drive_result_collection = TorsionDriveResultCollection.from_server(
    client=qc_client,
    datasets="OpenFF Protein Capped 3-mer Backbones v1.0",
    spec_name="default",
)
preview_dataset(torsion_drive_result_collection)
entries={'https://api.qcarchive.molssi.org:443/': [BasicResult(type='basic', record_id=32651764, cmiles='[H:6][C:1]1([C:2]([C:4]([C:5]([C:3]1([H:10])[H:11])([H:14])[H:15])([H:12])[H:13])([H:8])[H:9])[H:7]', inchi_key='RGSFGYAAUTVSQA-UHFFFAOYNA-N'), BasicResult(type='basic', record_id=32651734, cmiles='[H:7][C:1]1([C:2]([C:4]([C:6]([C:5]([C:3]1([H:11])[H:12])([H:15])[H:16])([H:17])[H:18])([H:13])[H:14])([H:9])[H:10])[H:8]', inchi_key='XDTMQSROBMDMFD-UHFFFAOYNA-N'), BasicResult(type='basic', record_id=32651722, cmiles='[H:7][c:1]1[c:2]([c:4]([c:6]([c:5]([c:3]1[H:9])[H:11])[H:12])[H:10])[H:8]', inchi_key='UHOVQNZJYSORNB-UHFFFAOYNA-N'), BasicResult(type='basic', record_id=32651809, cmiles='[H:11][c:1]1[c:2]([c:4]([c:6]([c:5]([c:3]1[H:13])[H:15])[C:7]([H:16])([H:17])[C:8]([H:18])([H:19])[C:9]([H:20])([H:21])[O:10][H:22])[H:14])[H:12]', inchi_key='VAJVDSVGBWFCLW-UHFFFAOYNA-N'), BasicResult(type='basic', record_id=32651781, cmiles='[H:11][c:1]1[c:2]([c:4]([c:6]([c:5]([c:3]1[H:13])[H:15])[C:7]([H:16])([H:17])[C:8]([H:18])([H:19])[C:9]([H:20])([H:21])[O:10][H:22])[H:14])[H:12]', inchi_key='VAJVDSVGBWFCLW-UHFFFAOYNA-N')]} provenance={} type='BasicResultCollection'
entries={'https://api.qcarchive.molssi.org:443/': [OptimizationResult(type='optimization', record_id=6091335, cmiles='[H:12][c:1]1[c:3]([c:4]([n:9][c:2]([n:8]1)[H:13])[N:10]2[C:6]([C:5]([C:7]2([H:18])[H:19])([H:14])[H:15])([H:16])[H:17])[Cl:11]', inchi_key='LQDSDOGLFAZWKS-UHFFFAOYNA-N'), OptimizationResult(type='optimization', record_id=6091336, cmiles='[H:14][c:1]1[c:3]([c:4]([n:11][c:2]([n:10]1)[H:15])[N:12]2[C:8]([C:6]([C:5]([C:7]([C:9]2([H:24])[H:25])([H:20])[H:21])([H:16])[H:17])([H:18])[H:19])([H:22])[H:23])[Cl:13]', inchi_key='VNZFCHDSOXTMKC-UHFFFAOYNA-N'), OptimizationResult(type='optimization', record_id=6091265, cmiles='[H:19][c:1]1[c:2]([c:4]([c:6]([c:5]([c:3]1[H:21])[N:15]2[C:9](=[O:16])[C:7]3=[C:8]([C:10]2=[O:17])[C:12]([C:14]([C:13]([C:11]3([H:23])[H:24])([H:27])[H:28])([H:29])[H:30])([H:25])[H:26])[F:18])[H:22])[H:20]', inchi_key='PINAKBXFAZUJBZ-UHFFFAOYNA-N'), OptimizationResult(type='optimization', record_id=6091454, cmiles='[H:17][c:1]1[c:2]([c:6]([c:10]([c:7]([c:3]1[H:19])[H:23])[C:13](=[O:15])[N:14]([H:26])[c:11]2[c:8]([c:4]([c:5]([c:9]([c:12]2[O:16][H:27])[H:25])[H:21])[H:20])[H:24])[H:22])[H:18]', inchi_key='UYKVWAQEMQDRGG-YHMJCDSINA-N'), OptimizationResult(type='optimization', record_id=6091455, cmiles='[H:17][c:1]1[c:2]([c:6]([c:10]([c:7]([c:3]1[H:19])[H:23])[C:13](=[O:15])[N:14]([H:26])[c:11]2[c:8]([c:4]([c:5]([c:9]([c:12]2[O:16][H:27])[H:25])[H:21])[H:20])[H:24])[H:22])[H:18]', inchi_key='UYKVWAQEMQDRGG-YHMJCDSINA-N')]} provenance={} type='OptimizationResultCollection'
entries={'https://api.qcarchive.molssi.org:443/': [TorsionDriveResult(type='torsion', record_id=104348544, cmiles='[H:13][C@@:8]([C:9](=[O:10])[N:17]([H:25])[C@:18]([H:26])([C:19](=[O:20])[N:30]([H:35])[C@:31]([H:36])([C:32](=[O:33])[N:40]([H:42])[C:41]([H:43])([H:44])[H:45])[C:34]([H:37])([H:38])[H:39])[C:21]([H:27])([H:28])[C:22](=[O:23])[O:24][H:29])([C:11]([H:14])([H:15])[H:16])[N:7]([H:12])[C:1](=[O:2])[C:3]([H:4])([H:5])[H:6]', inchi_key='DFRSMLTYXUADOY-WMXHRXQGNA-N'), TorsionDriveResult(type='torsion', record_id=104348638, cmiles='[H:13][C@@:8]([C:9](=[O:10])[N:17]([H:25])[C@:18]([H:26])([C:19](=[O:20])[N:31]([H:36])[C@:32]([H:37])([C:33](=[O:34])[N:41]([H:43])[C:42]([H:44])([H:45])[H:46])[C:35]([H:38])([H:39])[H:40])[C:21]([H:27])([H:28])[C:22](=[O:23])[N:24]([H:29])[H:30])([C:11]([H:14])([H:15])[H:16])[N:7]([H:12])[C:1](=[O:2])[C:3]([H:4])([H:5])[H:6]', inchi_key='KYQIKKWSTMGEQA-RIYGDZFYNA-N'), TorsionDriveResult(type='torsion', record_id=104348640, cmiles='[H:13][C@@:8]([C:9](=[O:10])[N:17]([H:25])[C@:18]([H:26])([C:19](=[O:20])[N:29]([H:34])[C@:30]([H:35])([C:31](=[O:32])[N:39]([H:41])[C:40]([H:42])([H:43])[H:44])[C:33]([H:36])([H:37])[H:38])[C:21]([H:27])([H:28])[C:22](=[O:24])[O-:23])([C:11]([H:14])([H:15])[H:16])[N:7]([H:12])[C:1](=[O:2])[C:3]([H:4])([H:5])[H:6]', inchi_key='DFRSMLTYXUADOY-YXJNUYKJNA-M'), TorsionDriveResult(type='torsion', record_id=104350274, cmiles='[H:13][C@@:8]([C:9](=[O:10])[N:17]([H:26])[C@:18]([H:27])([C:19](=[O:20])[N:33]([H:38])[C@:34]([H:39])([C:35](=[O:36])[N:43]([H:45])[C:44]([H:46])([H:47])[H:48])[C:37]([H:40])([H:41])[H:42])[C:21]([H:28])([H:29])[C:22]([H:30])([H:31])[C:23](=[O:24])[O:25][H:32])([C:11]([H:14])([H:15])[H:16])[N:7]([H:12])[C:1](=[O:2])[C:3]([H:4])([H:5])[H:6]', inchi_key='YVWAADJXUPAWMQ-MMEUZQAYNA-N'), TorsionDriveResult(type='torsion', record_id=104350994, 
cmiles='[H:13][C@@:8]([C:9](=[O:10])[N:17]([H:26])[C@:18]([H:27])([C:19](=[O:20])[N:34]([H:39])[C@:35]([H:40])([C:36](=[O:37])[N:44]([H:46])[C:45]([H:47])([H:48])[H:49])[C:38]([H:41])([H:42])[H:43])[C:21]([H:28])([H:29])[C:22]([H:30])([H:31])[C:23](=[O:24])[N:25]([H:32])[H:33])([C:11]([H:14])([H:15])[H:16])[N:7]([H:12])[C:1](=[O:2])[C:3]([H:4])([H:5])[H:6]', inchi_key='WGYPYUBMHJDXLX-XZTNJZJKNA-N')]} provenance={} type='TorsionDriveResultCollection'
Note: currently only complete results are pulled down by the from_server method.
There are two main inputs to the from_server method, in addition to the fractal client:
the name(s) of the existing dataset(s) to retrieve results from. This can be either the name of a single dataset or a list of dataset names
the name of the specification used to compute the records. Each specification corresponds to a particular basis, method, program and additional settings.
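Because the datasets argument accepts either a single name or a list of names, code that wraps from_server may want to normalise the argument first. The following is a minimal sketch of that idea; normalise_dataset_names is a hypothetical helper written for illustration and is not part of QCSubmit.

```python
from typing import List, Union


def normalise_dataset_names(datasets: Union[str, List[str]]) -> List[str]:
    """Return the dataset names as a list, whether one or several were given.

    Hypothetical helper for illustration only; not a QCSubmit function.
    """
    if isinstance(datasets, str):
        # A bare string is a single dataset name, not a sequence of characters.
        return [datasets]
    return list(datasets)


single = normalise_dataset_names("OpenFF BCC Refit Study COH v1.0")
several = normalise_dataset_names(
    [
        "OpenFF Gen 2 Opt Set 3 Pfizer Discrepancy",
        "OpenFF Gen 2 Opt Set 4 eMolecules Discrepancy",
    ]
)
print(single)
print(several)
```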
Let’s print out some basic information about each of these result collections:
[5]:
print("===ENERGY RESULTS===")
print(f"N RESULTS: {energy_result_collection.n_results}")
print(f"N MOLECULES: {energy_result_collection.n_molecules}")
print("===OPTIMIZATION RESULTS===")
print(f"N RESULTS: {optimization_result_collection.n_results}")
print(f"N MOLECULES: {optimization_result_collection.n_molecules}")
print("===TORSION DRIVE RESULTS===")
print(f"N RESULTS: {torsion_drive_result_collection.n_results}")
print(f"N MOLECULES: {torsion_drive_result_collection.n_molecules}")
===ENERGY RESULTS===
N RESULTS: 191
N MOLECULES: 91
===OPTIMIZATION RESULTS===
N RESULTS: 2398
N MOLECULES: 419
===TORSION DRIVE RESULTS===
N RESULTS: 23
N MOLECULES: 23
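The number of results can exceed the number of molecules because several records may refer to the same molecule. As a rough illustration, the sketch below counts records and distinct InChI keys for the five energy entries previewed earlier. It assumes that molecules are told apart by their inchi_key, which may not be exactly how QCSubmit computes n_molecules.

```python
# (record_id, inchi_key) pairs copied from the previewed energy entries.
previewed_entries = [
    (32651764, "RGSFGYAAUTVSQA-UHFFFAOYNA-N"),
    (32651734, "XDTMQSROBMDMFD-UHFFFAOYNA-N"),
    (32651722, "UHOVQNZJYSORNB-UHFFFAOYNA-N"),
    (32651809, "VAJVDSVGBWFCLW-UHFFFAOYNA-N"),
    (32651781, "VAJVDSVGBWFCLW-UHFFFAOYNA-N"),
]

# Every record counts as one result.
n_results = len(previewed_entries)
# Assumption: records sharing an InChI key describe the same molecule.
n_molecules = len({inchi_key for _, inchi_key in previewed_entries})

print(f"N RESULTS: {n_results}")
print(f"N MOLECULES: {n_molecules}")
```

The last two previewed records share an InChI key, so this subset contains five results but only four distinct molecules.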
The collections can be saved to and loaded from disk:
[6]:
# save the energy result collection to a JSON file
with open("energy-result-collection.json", "w") as file:
    file.write(energy_result_collection.json())
# re-load the serialized result collection
preview_dataset(BasicResultCollection.parse_file("energy-result-collection.json"))
entries={'https://api.qcarchive.molssi.org:443/': [BasicResult(type='basic', record_id=32651764, cmiles='[H:6][C:1]1([C:2]([C:4]([C:5]([C:3]1([H:10])[H:11])([H:14])[H:15])([H:12])[H:13])([H:8])[H:9])[H:7]', inchi_key='RGSFGYAAUTVSQA-UHFFFAOYNA-N'), BasicResult(type='basic', record_id=32651734, cmiles='[H:7][C:1]1([C:2]([C:4]([C:6]([C:5]([C:3]1([H:11])[H:12])([H:15])[H:16])([H:17])[H:18])([H:13])[H:14])([H:9])[H:10])[H:8]', inchi_key='XDTMQSROBMDMFD-UHFFFAOYNA-N'), BasicResult(type='basic', record_id=32651722, cmiles='[H:7][c:1]1[c:2]([c:4]([c:6]([c:5]([c:3]1[H:9])[H:11])[H:12])[H:10])[H:8]', inchi_key='UHOVQNZJYSORNB-UHFFFAOYNA-N'), BasicResult(type='basic', record_id=32651809, cmiles='[H:11][c:1]1[c:2]([c:4]([c:6]([c:5]([c:3]1[H:13])[H:15])[C:7]([H:16])([H:17])[C:8]([H:18])([H:19])[C:9]([H:20])([H:21])[O:10][H:22])[H:14])[H:12]', inchi_key='VAJVDSVGBWFCLW-UHFFFAOYNA-N'), BasicResult(type='basic', record_id=32651781, cmiles='[H:11][c:1]1[c:2]([c:4]([c:6]([c:5]([c:3]1[H:13])[H:15])[C:7]([H:16])([H:17])[C:8]([H:18])([H:19])[C:9]([H:20])([H:21])[O:10][H:22])[H:14])[H:12]', inchi_key='VAJVDSVGBWFCLW-UHFFFAOYNA-N')]} provenance={} type='BasicResultCollection'
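Since the serialized collection is plain JSON, it can also be inspected without QCSubmit. The sketch below writes a small stand-in file using the same top-level keys that appear in the printed output (entries, provenance, type) and reads it back with the standard library. The exact on-disk layout produced by the json() method is assumed here rather than verified, and the file name is hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Stand-in for a serialized collection; keys mirror the printed output.
stand_in = {
    "entries": {
        "https://api.qcarchive.molssi.org:443/": [
            {
                "type": "basic",
                "record_id": 32651722,
                "cmiles": "[H:7][c:1]1[c:2]([c:4]([c:6]([c:5]([c:3]1[H:9])[H:11])[H:12])[H:10])[H:8]",
                "inchi_key": "UHOVQNZJYSORNB-UHFFFAOYNA-N",
            }
        ]
    },
    "provenance": {},
    "type": "BasicResultCollection",
}

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, written to a throwaway directory.
    path = Path(tmp_dir) / "stand-in-collection.json"
    path.write_text(json.dumps(stand_in))
    loaded = json.loads(path.read_text())

# List the record IDs stored under each server address.
for address, results in loaded["entries"].items():
    print(address, [result["record_id"] for result in results])
```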
Each of these collections stores the referenced results in its entries dictionary. This dictionary uses the address of the QCFractal server as keys:
[7]:
torsion_drive_result_collection.entries.keys()
[7]:
dict_keys(['https://api.qcarchive.molssi.org:443/'])
This allows results generated by multiple different servers (e.g. a local fractal instance and the public QCArchive server) to be stored in a single result collection object.
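To make the idea concrete, the sketch below merges two address-keyed dictionaries using plain Python containers. It only illustrates the data layout: the local server address and its record IDs are hypothetical, and the real collections store result objects rather than bare integers.

```python
from collections import defaultdict

# Record IDs taken from the torsion drive preview, keyed by server address.
public_entries = {
    "https://api.qcarchive.molssi.org:443/": [104348544, 104348638],
}
# Hypothetical local QCFractal instance with made-up record IDs.
local_entries = {
    "http://localhost:7777/": [1, 2, 3],
}

merged = defaultdict(list)
for entries in (public_entries, local_entries):
    for address, record_ids in entries.items():
        # Results from the same server end up in the same list.
        merged[address].extend(record_ids)
merged = dict(merged)

for address, record_ids in merged.items():
    print(address, record_ids)
```

Keying by address keeps record IDs from different servers apart, which matters because IDs are only unique within a single server.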
The references to the actual data are then stored in corresponding lists:
[8]:
torsion_drive_result_collection.entries[qc_client.address][:10]
[8]:
[TorsionDriveResult(type='torsion', record_id=104348544, cmiles='[H:13][C@@:8]([C:9](=[O:10])[N:17]([H:25])[C@:18]([H:26])([C:19](=[O:20])[N:30]([H:35])[C@:31]([H:36])([C:32](=[O:33])[N:40]([H:42])[C:41]([H:43])([H:44])[H:45])[C:34]([H:37])([H:38])[H:39])[C:21]([H:27])([H:28])[C:22](=[O:23])[O:24][H:29])([C:11]([H:14])([H:15])[H:16])[N:7]([H:12])[C:1](=[O:2])[C:3]([H:4])([H:5])[H:6]', inchi_key='DFRSMLTYXUADOY-WMXHRXQGNA-N'),
TorsionDriveResult(type='torsion', record_id=104348638, cmiles='[H:13][C@@:8]([C:9](=[O:10])[N:17]([H:25])[C@:18]([H:26])([C:19](=[O:20])[N:31]([H:36])[C@:32]([H:37])([C:33](=[O:34])[N:41]([H:43])[C:42]([H:44])([H:45])[H:46])[C:35]([H:38])([H:39])[H:40])[C:21]([H:27])([H:28])[C:22](=[O:23])[N:24]([H:29])[H:30])([C:11]([H:14])([H:15])[H:16])[N:7]([H:12])[C:1](=[O:2])[C:3]([H:4])([H:5])[H:6]', inchi_key='KYQIKKWSTMGEQA-RIYGDZFYNA-N'),
TorsionDriveResult(type='torsion', record_id=104348640, cmiles='[H:13][C@@:8]([C:9](=[O:10])[N:17]([H:25])[C@:18]([H:26])([C:19](=[O:20])[N:29]([H:34])[C@:30]([H:35])([C:31](=[O:32])[N:39]([H:41])[C:40]([H:42])([H:43])[H:44])[C:33]([H:36])([H:37])[H:38])[C:21]([H:27])([H:28])[C:22](=[O:24])[O-:23])([C:11]([H:14])([H:15])[H:16])[N:7]([H:12])[C:1](=[O:2])[C:3]([H:4])([H:5])[H:6]', inchi_key='DFRSMLTYXUADOY-YXJNUYKJNA-M'),
TorsionDriveResult(type='torsion', record_id=104350274, cmiles='[H:13][C@@:8]([C:9](=[O:10])[N:17]([H:26])[C@:18]([H:27])([C:19](=[O:20])[N:33]([H:38])[C@:34]([H:39])([C:35](=[O:36])[N:43]([H:45])[C:44]([H:46])([H:47])[H:48])[C:37]([H:40])([H:41])[H:42])[C:21]([H:28])([H:29])[C:22]([H:30])([H:31])[C:23](=[O:24])[O:25][H:32])([C:11]([H:14])([H:15])[H:16])[N:7]([H:12])[C:1](=[O:2])[C:3]([H:4])([H:5])[H:6]', inchi_key='YVWAADJXUPAWMQ-MMEUZQAYNA-N'),
TorsionDriveResult(type='torsion', record_id=104350994, cmiles='[H:13][C@@:8]([C:9](=[O:10])[N:17]([H:26])[C@:18]([H:27])([C:19](=[O:20])[N:34]([H:39])[C@:35]([H:40])([C:36](=[O:37])[N:44]([H:46])[C:45]([H:47])([H:48])[H:49])[C:38]([H:41])([H:42])[H:43])[C:21]([H:28])([H:29])[C:22]([H:30])([H:31])[C:23](=[O:24])[N:25]([H:32])[H:33])([C:11]([H:14])([H:15])[H:16])[N:7]([H:12])[C:1](=[O:2])[C:3]([H:4])([H:5])[H:6]', inchi_key='WGYPYUBMHJDXLX-XZTNJZJKNA-N'),
TorsionDriveResult(type='torsion', record_id=104352631, cmiles='[H:30][C@@:25]([C:26](=[O:27])[N:34]([H:36])[C:35]([H:37])([H:38])[H:39])([C:28]([H:31])([H:32])[H:33])[N:24]([H:29])[C:19](=[O:20])[C:18]([H:22])([H:23])[N:17]([H:21])[C:9](=[O:10])[C@:8]([H:13])([C:11]([H:14])([H:15])[H:16])[N:7]([H:12])[C:1](=[O:2])[C:3]([H:4])([H:5])[H:6]', inchi_key='MSZWGDWDOLQBKL-IABNWRJLNA-N'),
TorsionDriveResult(type='torsion', record_id=104353304, cmiles='[H:32][C:24]1=[C:22]([N:23]([C:25](=[N:26]1)[H:33])[H:31])[C:21]([H:29])([H:30])[C@@:18]([H:28])([C:19](=[O:20])[N:34]([H:39])[C@:35]([H:40])([C:36](=[O:37])[N:44]([H:46])[C:45]([H:47])([H:48])[H:49])[C:38]([H:41])([H:42])[H:43])[N:17]([H:27])[C:9](=[O:10])[C@:8]([H:13])([C:11]([H:14])([H:15])[H:16])[N:7]([H:12])[C:1](=[O:2])[C:3]([H:4])([H:5])[H:6]', inchi_key='AEUGKYSAGBPNIR-BOUJXIRCNA-N'),
TorsionDriveResult(type='torsion', record_id=104353308, cmiles='[H:32][C:24]1=[C:22]([N+:23](=[C:25]([N:26]1[H:34])[H:33])[H:31])[C:21]([H:29])([H:30])[C@@:18]([H:28])([C:19](=[O:20])[N:35]([H:40])[C@:36]([H:41])([C:37](=[O:38])[N:45]([H:47])[C:46]([H:48])([H:49])[H:50])[C:39]([H:42])([H:43])[H:44])[N:17]([H:27])[C:9](=[O:10])[C@:8]([H:13])([C:11]([H:14])([H:15])[H:16])[N:7]([H:12])[C:1](=[O:2])[C:3]([H:4])([H:5])[H:6]', inchi_key='AEUGKYSAGBPNIR-ICFOULITNA-O'),
TorsionDriveResult(type='torsion', record_id=104353312, cmiles='[H:13][C@@:8]([C:9](=[O:10])[N:17]([H:25])[C@:18]([H:26])([C:19](=[O:20])[N:36]([H:41])[C@:37]([H:42])([C:38](=[O:39])[N:46]([H:48])[C:47]([H:49])([H:50])[H:51])[C:40]([H:43])([H:44])[H:45])[C:21]([H:27])([H:28])[C:22]([H:29])([C:23]([H:30])([H:31])[H:32])[C:24]([H:33])([H:34])[H:35])([C:11]([H:14])([H:15])[H:16])[N:7]([H:12])[C:1](=[O:2])[C:3]([H:4])([H:5])[H:6]', inchi_key='JTPXVQPOMDABBJ-QNCMGGKINA-N'),
TorsionDriveResult(type='torsion', record_id=104353318, cmiles='[H:13][C@@:8]([C:9](=[O:10])[N:17]([H:25])[C@:18]([H:26])([C:19](=[O:20])[N:34]([H:39])[C@:35]([H:40])([C:36](=[O:37])[N:44]([H:46])[C:45]([H:47])([H:48])[H:49])[C:38]([H:41])([H:42])[H:43])[C:21]([H:27])([H:28])[C:22]([H:29])([H:30])[S:23][C:24]([H:31])([H:32])[H:33])([C:11]([H:14])([H:15])[H:16])[N:7]([H:12])[C:1](=[O:2])[C:3]([H:4])([H:5])[H:6]', inchi_key='ZGBZZLLRJFEPFE-SLHFYQOHNA-N')]
Notice that the entries stored in the collection are not the actual result records generated and stored on the server, but rather references to them. In particular, the unique ID of each record is stored along with a mapped SMILES (cmiles) representation of the molecule the result was generated for.
The main reason for doing this is that we often would like to be able to state which data we would like to use in an application without having to create multiple copies of the data. Not only can this take up large amounts of disk space, it runs the risk of data becoming out of sync with the original if the format the records are stored in changes or the local copy of the data is accidentally mutated. Storing a reference to the original data and then retrieving it when needed is typically a cleaner and safer solution.
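The reference-then-retrieve pattern described above can be sketched with standard-library types. Everything in this block is a stand-in: RecordReference, FAKE_SERVER and resolve are hypothetical names, and the dictionary plays the role of the remote server. It is not how QCSubmit is implemented.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class RecordReference:
    """Lightweight pointer to a record held elsewhere (illustrative only)."""

    address: str
    record_id: int
    cmiles: str


# Stand-in for the remote server: maps (address, record_id) to the full record.
# A real record would be far larger than this placeholder dictionary.
FAKE_SERVER: Dict[Tuple[str, int], dict] = {
    ("https://api.qcarchive.molssi.org:443/", 32651722): {
        "id": 32651722,
        "note": "placeholder for the full record",
    }
}


def resolve(reference: RecordReference) -> dict:
    """Fetch the full record only when it is actually needed."""
    return FAKE_SERVER[(reference.address, reference.record_id)]


reference = RecordReference(
    address="https://api.qcarchive.molssi.org:443/",
    record_id=32651722,
    cmiles="[H:7][c:1]1[c:2]([c:4]([c:6]([c:5]([c:3]1[H:9])[H:11])[H:12])[H:10])[H:8]",
)

record = resolve(reference)
print(record["id"])
```

Only the small, immutable reference is stored and shared; the data itself stays in one place and is looked up on demand.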
Retrieving the result records
The raw result record objects can be easily retrieved using the result collection objects. This allows us to filter the collection to only retrieve the results we want. We’ll talk more about filters in an upcoming section, but for now we just apply a SMARTS string to limit our record download to the cysteine record:
[9]:
from openff.qcsubmit.results.filters import SMARTSFilter
filtered_collection = torsion_drive_result_collection.filter(
    SMARTSFilter(smarts_to_include=["C[SH]"])
)
filtered_collection
[9]:
TorsionDriveResultCollection(entries={'https://api.qcarchive.molssi.org:443/': [TorsionDriveResult(type='torsion', record_id=104349083, cmiles='[H:30][C@@:24]([C:25](=[O:26])[N:34]([H:41])[C@:35]([H:42])([C:36](=[O:37])[N:50]([H:52])[C:51]([H:53])([H:54])[H:55])[C:38]([H:43])([C:39]([H:44])([H:45])[H:46])[C:40]([H:47])([H:48])[H:49])([C:27]([H:31])([H:32])[S:28][H:33])[N:23]([H:29])[C:9](=[O:10])[C@:8]([H:15])([C:11]([H:16])([C:12]([H:17])([H:18])[H:19])[C:13]([H:20])([H:21])[H:22])[N:7]([H:14])[C:1](=[O:2])[C:3]([H:4])([H:5])[H:6]', inchi_key='NXZIIAGLOSBSES-BCDINFCLNA-N')]}, provenance={'applied-filters': {'SMARTSFilter-0': {'smarts_to_include': ['C[SH]'], 'smarts_to_exclude': None}}}, type='TorsionDriveResultCollection')
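The filtered collection's provenance now lists the filter that was applied, under a key made of the filter name and an index. The sketch below reproduces that bookkeeping with plain dictionaries. The helper record_applied_filter is hypothetical, and the rule that the index equals the number of filters already recorded is an assumption, not a statement about QCSubmit's internals.

```python
from typing import Any, Dict


def record_applied_filter(
    provenance: Dict[str, Any], filter_name: str, settings: Dict[str, Any]
) -> Dict[str, Any]:
    """Return a copy of provenance with one more applied filter recorded.

    Hypothetical helper; the numbering rule is an assumption.
    """
    applied = dict(provenance.get("applied-filters", {}))
    # Assumed rule: the suffix is the number of filters recorded so far.
    applied[f"{filter_name}-{len(applied)}"] = settings
    return {**provenance, "applied-filters": applied}


provenance = record_applied_filter(
    {},
    "SMARTSFilter",
    {"smarts_to_include": ["C[SH]"], "smarts_to_exclude": None},
)
print(provenance)
```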
Then we can download the actual results. This can be very slow, so it’s worth filtering aggressively:
[10]:
torsion_drive_records = filtered_collection.to_records()
torsion_drive_records
[10]:
[(TorsiondriveRecord(id=104349083, record_type='torsiondrive', is_service=True, properties={}, extras={}, status=<RecordStatusEnum.complete: 'complete'>, manager_name=None, created_on=datetime.datetime(2022, 5, 31, 3, 11, 41, 558377, tzinfo=datetime.timezone.utc), modified_on=datetime.datetime(2022, 5, 31, 3, 11, 41, 558375, tzinfo=datetime.timezone.utc), owner_user=None, owner_group=None, compute_history_=None, task_=None, service_=None, comments_=None, native_files_=None, specification=TorsiondriveSpecification(program='torsiondrive', optimization_specification=OptimizationSpecification(program='geometric', qc_specification=QCSpecification(program='psi4', driver=<SinglepointDriver.deferred: 'deferred'>, method='b3lyp-d3bj', basis='dzvp', keywords={'maxiter': 200, 'scf_properties': ['dipole', 'quadrupole', 'wiberg_lowdin_indices', 'mayer_indices']}, protocols=AtomicResultProtocols(wavefunction=<WavefunctionProtocolEnum.none: 'none'>, stdout=True, error_correction=ErrorCorrectionProtocol(default_policy=True, policies=None), native_files=<NativeFilesProtocolEnum.none: 'none'>)), keywords={'tmax': 0.3, 'check': 0, 'qccnv': True, 'reset': True, 'trust': 0.1, 'molcnv': False, 'enforce': 0.1, 'epsilon': 0, 'maxiter': 300, 'coordsys': 'dlc', 'constraints': {'set': [{'type': 'dihedral', 'value': -65.0, 'indices': [22, 23, 26, 27]}]}, 'convergence_set': 'gau'}, protocols=OptimizationProtocols(trajectory=<TrajectoryProtocolEnum.all: 'all'>)), keywords=TorsiondriveKeywords(dihedrals=[(8, 22, 23, 24), (22, 23, 24, 33)], grid_spacing=[15, 15], dihedral_ranges=[(-180, 165), (-180, 165)], energy_decrease_thresh=None, energy_upper_limit=0.05)), initial_molecules_ids_=None, initial_molecules_=None, optimizations_=None, minimum_optimizations_={'[0, 0]': 108906495, '[0, 15]': 105861438, '[0, 30]': 107405834, '[0, 45]': 110088520, '[0, 60]': 109544346, '[0, 75]': 105861454, '[0, 90]': 105861457, '[15, 0]': 110088528, '[30, 0]': 110303147, '[45, 0]': 109544448, '[60, 0]': 109544498, 
'[75, 0]': 110088636, '[90, 0]': 108906792, '[-15, 0]': 105861336, '[-30, 0]': 107405701, '[-45, 0]': 105861145, '[-60, 0]': 108906351, '[-75, 0]': 108906329, '[-90, 0]': 108906315, '[0, -15]': 107405827, '[0, -30]': 107405825, '[0, -45]': 110640763, '[0, -60]': 110542489, '[0, -75]': 105861411, '[0, -90]': 107405814, '[0, 105]': 104355892, '[0, 120]': 107405852, '[0, 135]': 108906509, '[0, 150]': 108906511, '[0, 165]': 107405861, '[0, 180]': 107405863, '[105, 0]': 107406269, '[120, 0]': 105862217, '[135, 0]': 105862332, '[15, 15]': 110303132, '[15, 30]': 110088532, '[15, 45]': 110303136, '[15, 60]': 108906551, '[15, 75]': 109544374, '[15, 90]': 105861572, '[150, 0]': 105862426, '[165, 0]': 107406447, '[180, 0]': 109544622, '[30, 15]': 110088554, '[30, 30]': 109544408, '[30, 45]': 110088557, '[30, 60]': 109544410, '[30, 75]': 110088559, '[30, 90]': 107406009, '[45, 15]': 110303172, '[45, 30]': 110303175, '[45, 45]': 110303177, '[45, 60]': 110088589, '[45, 75]': 109544459, '[45, 90]': 108906673, '[60, 15]': 110303198, '[60, 30]': 110542518, '[60, 45]': 110088618, '[60, 60]': 110088619, '[60, 75]': 110088621, '[60, 90]': 109544512, '[75, 15]': 110088638, '[75, 30]': 109544547, '[75, 45]': 109544550, '[75, 60]': 109544553, '[75, 75]': 109544554, '[75, 90]': 110088647, '[90, 15]': 109544578, '[90, 30]': 108906797, '[90, 45]': 109544582, '[90, 60]': 110088655, '[90, 75]': 110088656, '[90, 90]': 110303226, '[-105, 0]': 110088443, '[-120, 0]': 110303012, '[-135, 0]': 110088385, '[-15, 15]': 104355862, '[-15, 30]': 108906457, '[-15, 45]': 107405771, '[-15, 60]': 109544334, '[-15, 75]': 108906465, '[-15, 90]': 105861359, '[-150, 0]': 109544168, '[-165, 0]': 108906241, '[-30, 15]': 107405703, '[-30, 30]': 107405705, '[-30, 45]': 107405710, '[-30, 60]': 105861257, '[-30, 75]': 107405716, '[-30, 90]': 107405718, '[-45, 15]': 108906389, '[-45, 30]': 107405651, '[-45, 45]': 108906390, '[-45, 60]': 107405657, '[-45, 75]': 107405661, '[-45, 90]': 105861168, '[-60, 15]': 109544279, 
'[-60, 30]': 105861055, '[-60, 45]': 109544280, '[-60, 60]': 109544281, '[-60, 75]': 105861067, '[-60, 90]': 107405618, '[-75, 15]': 108906332, '[-75, 30]': 109544253, '[-75, 45]': 110088480, '[-75, 60]': 108906335, '[-75, 75]': 109544259, '[-75, 90]': 110088483, '[-90, 15]': 110303087, '[-90, 30]': 110088460, '[-90, 45]': 110303088, '[-90, 60]': 110088463, '[-90, 75]': 110088464, '[-90, 90]': 110303092, '[0, -105]': 108906486, '[0, -120]': 109544342, '[0, -135]': 107405807, '[0, -150]': 105861394, '[0, -165]': 108906478, '[105, 15]': 108906824, '[105, 30]': 107406272, '[105, 45]': 107406276, '[105, 60]': 105862138, '[105, 75]': 110303229, '[105, 90]': 107406283, '[120, 15]': 104356077, '[120, 30]': 105862225, '[120, 45]': 105862227, '[120, 60]': 104356080, '[120, 75]': 105862236, '[120, 90]': 105862242, '[135, 15]': 105862333, '[135, 30]': 105862338, '[135, 45]': 105862341, '[135, 60]': 107406338, '[135, 75]': 105862349, '[135, 90]': 107406342, '[15, -15]': 110088527, '[15, -30]': 109544367, '[15, -45]': 110542494, '[15, -60]': 110640767, '[15, -75]': 110542492, '[15, -90]': 108906530, '[15, 105]': 105861576, '[15, 120]': 105861579, '[15, 135]': 108906557, '[15, 150]': 107405948, '[15, 165]': 109544380, '[15, 180]': 107405953, '[150, 15]': 107406378, '[150, 30]': 105862435, '[150, 45]': 107406382, '[150, 60]': 105862442, '[150, 75]': 107406387, '[150, 90]': 105862450, '[165, 15]': 105862525, '[165, 30]': 109544600, '[165, 45]': 110088664, '[165, 60]': 105862539, '[165, 75]': 105862541, '[165, 90]': 107406470, '[180, 15]': 108906878, '[180, 30]': 109544625, '[180, 45]': 108906884, '[180, 60]': 109544630, '[180, 75]': 109544632, '[180, 90]': 109544634, '[30, -15]': 109544399, '[30, -30]': 110088550, '[30, -45]': 110303142, '[30, -60]': 110542499, '[30, -75]': 110303139, '[30, -90]': 110640769, '[30, 105]': 107406012, '[30, 120]': 108906610, '[30, 135]': 108906613, '[30, 150]': 109544420, '[30, 165]': 109544421, '[30, 180]': 109544424, '[45, -15]': 109544445, '[45, 
-30]': 109544442, '[45, -45]': 110303168, '[45, -60]': 110088576, '[45, -75]': 110542505, '[45, -90]': 110748886, '[45, 105]': 108906677, '[45, 120]': 109544466, '[45, 135]': 109544467, '[45, 150]': 110088594, '[45, 165]': 108906688, '[45, 180]': 109544476, '[60, -15]': 110088609, '[60, -30]': 108906703, '[60, -45]': 108906701, '[60, -60]': 109544487, '[60, -75]': 110542516, '[60, -90]': 110640779, '[60, 105]': 109544515, '[60, 120]': 107406147, '[60, 135]': 109544519, '[60, 150]': 109544521, '[60, 165]': 109544524, '[60, 180]': 108906728, '[75, -15]': 108906749, '[75, -30]': 109544539, '[75, -45]': 107406169, '[75, -60]': 110088635, '[75, -75]': 110640785, '[75, -90]': 110748890, '[75, 105]': 108906761, '[75, 120]': 108906763, '[75, 135]': 107406189, '[75, 150]': 108906766, '[75, 165]': 108906768, '[75, 180]': 109544567, '[90, -15]': 107406217, '[90, -30]': 107406214, '[90, -45]': 105862014, '[90, -60]': 108906784, '[90, -75]': 109544573, '[90, -90]': 105862002, '[90, 105]': 105862052, '[90, 120]': 109544586, '[90, 135]': 108906804, '[90, 150]': 108906805, '[90, 165]': 107406240, '[90, 180]': 108906808, '[-105, 15]': 110088445, '[-105, 30]': 110303059, '[-105, 45]': 110088447, '[-105, 60]': 110088450, '[-105, 75]': 110303065, '[-105, 90]': 110088452, '[-120, 15]': 110542406, '[-120, 30]': 110542407, '[-120, 45]': 110748868, '[-120, 60]': 110640725, '[-120, 75]': 110542411, '[-120, 90]': 110640726, '[-135, 15]': 110302960, '[-135, 30]': 110542375, '[-135, 45]': 110640712, '[-135, 60]': 110542378, '[-135, 75]': 110302967, '[-135, 90]': 110088396, '[-15, -15]': 105861334, '[-15, -30]': 105861329, '[-15, -45]': 107405758, '[-15, -60]': 107405754, '[-15, -75]': 107405753, '[-15, -90]': 107405751, '[-15, 105]': 105861365, '[-15, 120]': 105861368, '[-15, 135]': 108906468, '[-15, 150]': 107405791, '[-15, 165]': 108906474, '[-15, 180]': 108906476, '[-150, 15]': 110088338, '[-150, 30]': 109544169, '[-150, 45]': 110542361, '[-150, 60]': 110640709, '[-150, 75]': 110088346, 
'[-150, 90]': 110302928, '[-165, 15]': 109544118, '[-165, 30]': 110088301, '[-165, 45]': 110088302, '[-165, 60]': 109544124, '[-165, 75]': 107405484, '[-165, 90]': 110088306, '[-30, -15]': 104355836, '[-30, -30]': 107405695, '[-30, -45]': 108906423, '[-30, -60]': 108906422, '[-30, -75]': 107405690, '[-30, -90]': 108906418, '[-30, 105]': 105861267, '[-30, 120]': 105861273, '[-30, 135]': 107405726, '[-30, 150]': 108906433, '[-30, 165]': 107405731, '[-30, 180]': 108906439, '[-45, -15]': 107405642, '[-45, -30]': 105861135, '[-45, -45]': 107405638, '[-45, -60]': 107405635, '[-45, -75]': 107405634, '[-45, -90]': 108906381, '[-45, 105]': 107405666, '[-45, 120]': 108906396, '[-45, 135]': 108906397, '[-45, 150]': 110303121, '[-45, 165]': 108906400, '[-45, 180]': 108906401, '[-60, -15]': 108906350, '[-60, -30]': 107405602, '[-60, -45]': 105861037, '[-60, -60]': 105861033, '[-60, -75]': 105861029, '[-60, -90]': 107405594, '[-60, 105]': 108906364, '[-60, 120]': 110088499, '[-60, 135]': 110303118, '[-60, 150]': 110088502, '[-60, 165]': 109544289, '[-60, 180]': 110088504, '[-75, -15]': 105860949, '[-75, -30]': 105860944, '[-75, -45]': 104355762, '[-75, -60]': 105860937, '[-75, -75]': 105860931, '[-75, -90]': 107405575, '[-75, 105]': 110303106, '[-75, 120]': 110088486, '[-75, 135]': 110088488, '[-75, 150]': 110640758, '[-75, 165]': 110542486, '[-75, 180]': 109544264, '[-90, -15]': 108906314, '[-90, -30]': 107405567, '[-90, -45]': 105860845, '[-90, -60]': 108906310, '[-90, -75]': 109544231, '[-90, -90]': 108906309, '[-90, 105]': 110088468, '[-90, 120]': 110640752, '[-90, 135]': 110640754, '[-90, 150]': 110542473, '[-90, 165]': 110640757, '[-90, 180]': 110542479, '[105, -15]': 105862118, '[105, -30]': 105862111, '[105, -45]': 105862108, '[105, -60]': 107406259, '[105, -75]': 107406258, '[105, -90]': 107406255, '[105, 105]': 105862147, '[105, 120]': 105862153, '[105, 135]': 107406286, '[105, 150]': 107406288, '[105, 165]': 105862163, '[105, 180]': 104356064, '[120, -15]': 105862214, 
'[120, -30]': 105862209, '[120, -45]': 105862204, '[120, -60]': 105862199, '[120, -75]': 105862196, '[120, -90]': 105862192, '[120, 105]': 105862246, '[120, 120]': 105862248, '[120, 135]': 105862253, '[120, 150]': 104356086, '[120, 165]': 104356087, '[120, 180]': 104356088, '[135, -15]': 104356099, '[135, -30]': 104356098, '[135, -45]': 105862302, '[135, -60]': 105862296, '[135, -75]': 104356095, '[135, -90]': 105862290, '[135, 105]': 105862359, '[135, 120]': 107406343, '[135, 135]': 107406344, '[135, 150]': 105862371, '[135, 165]': 104356111, '[135, 180]': 105862379, '[15, -105]': 107405878, '[15, -120]': 108906526, '[15, -135]': 108906524, '[15, -150]': 110088522, '[15, -165]': 108906521, '[150, -15]': 105862421, '[150, -30]': 105862417, '[150, -45]': 107406369, '[150, -60]': 105862410, '[150, -75]': 105862405, '[150, -90]': 105862401, '[150, 105]': 105862455, '[150, 120]': 105862458, '[150, 135]': 107406396, '[150, 150]': 105862467, '[150, 165]': 105862472, '[150, 180]': 107406404, '[165, -15]': 105862517, '[165, -30]': 107406443, '[165, -45]': 107406438, '[165, -60]': 107406434, '[165, -75]': 107406431, '[165, -90]': 105862498, '[165, 105]': 107406473, '[165, 120]': 105862555, '[165, 135]': 107406481, '[165, 150]': 107406485, '[165, 165]': 104356159, '[165, 180]': 108906855, '[180, -15]': 108906876, '[180, -30]': 109544617, '[180, -45]': 107406522, '[180, -60]': 105862601, '[180, -75]': 109544614, '[180, -90]': 108906867, '[180, 105]': 108906896, '[180, 120]': 107406562, '[180, 135]': 108906900, '[180, 150]': 105862657, '[180, 165]': 105862661, '[180, 180]': 108906905, '[30, -105]': 108906576, '[30, -120]': 109544390, '[30, -135]': 109544387, '[30, -150]': 109544385, '[30, -165]': 108906571, '[45, -105]': 110640773, '[45, -120]': 110303160, '[45, -135]': 110088567, '[45, -150]': 109544429, '[45, -165]': 109544428, '[60, -105]': 110748888, '[60, -120]': 110303189, '[60, -135]': 110542509, '[60, -150]': 110088598, '[60, -165]': 109544480, '[75, -105]': 110640784, 
'[75, -120]': 110542520, '[75, -135]': 110303211, '[75, -150]': 110303209, '[75, -165]': 110088631, '[90, -105]': 108906780, '[90, -120]': 107406201, '[90, -135]': 110303223, '[90, -150]': 110088650, '[90, -165]': 107406196, '[-105, -15]': 108906305, '[-105, -30]': 107405557, '[-105, -45]': 107405556, '[-105, -60]': 107405554, '[-105, -75]': 110088441, '[-105, -90]': 109544220, '[-105, 105]': 110303067, '[-105, 120]': 110640739, '[-105, 135]': 110640741, '[-105, 150]': 110847232, '[-105, 165]': 110748881, '[-105, 180]': 110303075, '[-120, -15]': 110542402, '[-120, -30]': 110542400, '[-120, -45]': 110303006, '[-120, -60]': 110542396, '[-120, -75]': 110640721, '[-120, -90]': 110302998, '[-120, 105]': 110088433, '[-120, 120]': 110640728, '[-120, 135]': 110640729, '[-120, 150]': 110748873, '[-120, 165]': 110640731, '[-120, 180]': 110542420, '[-135, -15]': 110302956, '[-135, -30]': 110542372, '[-135, -45]': 110542371, '[-135, -60]': 110302946, '[-135, -75]': 110542368, '[-135, -90]': 110088372, '[-135, 105]': 110302973, '[-135, 120]': 110088401, '[-135, 135]': 110302977, '[-135, 150]': 110302979, '[-135, 165]': 110542381, '[-135, 180]': 110542383, '[-15, -105]': 108906450, '[-15, -120]': 107405747, '[-15, -135]': 107405743, '[-15, -150]': 107405740, '[-15, -165]': 105861292, '[-150, -15]': 110088332, '[-150, -30]': 110088331, '[-150, -45]': 110088328, '[-150, -60]': 110640708, '[-150, -75]': 110302916, '[-150, -90]': 110302915, '[-150, 105]': 110088351, '[-150, 120]': 110302930, '[-150, 135]': 109544184, '[-150, 150]': 110302931, '[-150, 165]': 108906285, '[-150, 180]': 110302932, '[-165, -15]': 108906239, '[-165, -30]': 109544109, '[-165, -45]': 109544106, '[-165, -60]': 110542356, '[-165, -75]': 110640705, '[-165, -90]': 109544101, '[-165, 105]': 110088307, '[-165, 120]': 109544134, '[-165, 135]': 109544135, '[-165, 150]': 108906262, '[-165, 165]': 108906264, '[-165, 180]': 108906266, '[-30, -105]': 108906414, '[-30, -120]': 109544319, '[-30, -135]': 109544318, '[-30, 
-150]': 110088515, '[-30, -165]': 109544315, '[-45, -105]': 109544301, '[-45, -120]': 109544299, '[-45, -135]': 108906371, '[-45, -150]': 109544292, '[-45, -165]': 108906369, '[-60, -105]': 108906344, '[-60, -120]': 108906342, '[-60, -135]': 109544271, '[-60, -150]': 110088492, '[-60, -165]': 109544266, '[-75, -105]': 110088479, '[-75, -120]': 109544247, '[-75, -135]': 110303102, '[-75, -150]': 110542481, '[-75, -165]': 110088471, '[-90, -105]': 109544230, '[-90, -120]': 110088455, '[-90, -135]': 110542464, '[-90, -150]': 110640751, '[-90, -165]': 110847234, '[105, -105]': 105862092, '[105, -120]': 108906816, '[105, -135]': 107406250, '[105, -150]': 108906813, '[105, -165]': 105862076, '[120, -105]': 104356069, '[120, -120]': 105862186, '[120, -135]': 105862181, '[120, -150]': 105862178, '[120, -165]': 105862174, '[135, -105]': 105862285, '[135, -120]': 105862281, '[135, -135]': 107406325, '[135, -150]': 104356090, '[135, -165]': 105862267, '[150, -105]': 105862398, '[150, -120]': 105862394, '[150, -135]': 105862389, '[150, -150]': 105862386, '[150, -165]': 105862381, '[165, -105]': 107406424, '[165, -120]': 105862490, '[165, -135]': 107406417, '[165, -150]': 107406414, '[165, -165]': 105862479, '[180, -105]': 107406508, '[180, -120]': 108906863, '[180, -135]': 109544609, '[180, -150]': 105862577, '[180, -165]': 107406495, '[-105, -105]': 110088440, '[-105, -120]': 110542430, '[-105, -135]': 110640735, '[-105, -150]': 110640734, '[-105, -165]': 110748875, '[-120, -105]': 110542391, '[-120, -120]': 110542390, '[-120, -135]': 110302990, '[-120, -150]': 110302988, '[-120, -165]': 110542384, '[-135, -105]': 110542366, '[-135, -120]': 110302938, '[-135, -135]': 110542365, '[-135, -150]': 110302935, '[-135, -165]': 110302934, '[-150, -105]': 109544154, '[-150, -120]': 109544151, '[-150, -135]': 108906271, '[-150, -150]': 109544146, '[-150, -165]': 110302914, '[-165, -105]': 109544099, '[-165, -120]': 109544096, '[-165, -135]': 108906225, '[-165, -150]': 109544089, 
'[-165, -165]': 110088285}),
Molecule with name '' and SMILES '[H][C@@](C(=O)N([H])[C@]([H])(C(=O)N([H])C([H])([H])[H])C([H])(C([H])([H])[H])C([H])([H])[H])(C([H])([H])S[H])N([H])C(=O)[C@]([H])(C([H])(C([H])([H])[H])C([H])([H])[H])N([H])C(=O)C([H])([H])[H]')]
QCSubmit takes care of pulling the data from the server efficiently, making use of the pagination that QCFractal provides. It also attempts to cache all calls to the server so that repeated calls to to_records
do not need to query the server again.
Notice that not only are the raw result records retrieved, but also an OpenFF Molecule
object is created for each result record. This molecule has the correct ordering and also stores any conformers associated with the result collection. For basic collections, the conformer is the one that was used in any calculations; for optimization collections, it is the final conformer yielded by the optimization; and for torsion drives, it is the lowest energy conformer for each sampled torsion angle.
Inspecting results
In the case of torsion drive records, we can easily iterate over the grid ID, the associated conformer, and the associated energy in one go:
[11]:
torsion_drive_record, molecule = torsion_drive_records[0]
for grid_id, qc_conformer in zip(
    molecule.properties["grid_ids"][:10], molecule.conformers
):
    qc_energy = torsion_drive_record.final_energies[grid_id]
    print(f"{grid_id} E={qc_energy:.4f} Ha")
(-165, -165) E=-1546.1642 Ha
(-165, -150) E=-1546.1627 Ha
(-165, -135) E=-1546.1621 Ha
(-165, -120) E=-1546.1617 Ha
(-165, -105) E=-1546.1625 Ha
(-165, -90) E=-1546.1642 Ha
(-165, -75) E=-1546.1666 Ha
(-165, -60) E=-1546.1682 Ha
(-165, -45) E=-1546.1684 Ha
(-165, -30) E=-1546.1689 Ha
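Absolute energies in Hartree are hard to compare by eye, so it can help to express them relative to the lowest value. The following is a minimal sketch using only the ten energies printed above; the helper name relative_energies is hypothetical and is not part of QCSubmit.

```python
# Conversion factor from Hartree to kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.509474

# Energies (in Hartree) copied from the printed output above, keyed by grid ID.
energies = {
    (-165, -165): -1546.1642,
    (-165, -150): -1546.1627,
    (-165, -135): -1546.1621,
    (-165, -120): -1546.1617,
    (-165, -105): -1546.1625,
    (-165, -90): -1546.1642,
    (-165, -75): -1546.1666,
    (-165, -60): -1546.1682,
    (-165, -45): -1546.1684,
    (-165, -30): -1546.1689,
}


def relative_energies(energies_hartree):
    """Return energies in kcal/mol relative to the lowest energy in the mapping."""
    minimum = min(energies_hartree.values())
    return {
        grid_id: (energy - minimum) * HARTREE_TO_KCAL_PER_MOL
        for grid_id, energy in energies_hartree.items()
    }


relative = relative_energies(energies)
lowest_grid_id = min(relative, key=relative.get)

for grid_id, energy in relative.items():
    print(f"{grid_id} dE={energy:.2f} kcal/mol")
```

The minimum here is only the lowest of the ten points shown, not necessarily the global minimum of the full scan.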
Basic results from optimization results
It is common for certain datasets within a QCFractal server to be created using the output of another dataset. This is especially the case for datasets of hessian records that are computed using the conformer produced by an optimization.
The OptimizationResultCollection
currently provides a to_basic_result_collection
method to handle such cases. This can take some time to run:
[13]:
derived_hessian_collection = optimization_result_collection.to_basic_result_collection(
    driver="hessian"
)
derived_hessian_collection.n_results
[13]:
2378
This is a particularly useful way to access hessian data contained within older datasets. Older datasets do not usually store SMILES information for their result records, and hence it can be difficult to know exactly which molecule the hessian was computed for. The to_basic_result_collection
method takes care of this by propagating SMILES information from the parent optimization record down to the child hessian result record.
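To illustrate the idea of propagating SMILES from a parent record to a child record, here is a toy sketch. Everything in it is an assumption made for illustration: the ToyOptimization and ToyHessian classes, their fields, and the matching on a shared molecule ID are hypothetical and do not describe how QCSubmit implements this internally.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyOptimization:
    """Hypothetical stand-in for an optimization entry that knows its SMILES."""

    record_id: int
    cmiles: str
    final_molecule_id: int


@dataclass
class ToyHessian:
    """Hypothetical stand-in for a hessian record that lacks SMILES."""

    record_id: int
    molecule_id: int
    cmiles: Optional[str] = None


def propagate_cmiles(optimizations, hessians):
    """Attach the parent's SMILES to each hessian computed on its final molecule."""
    cmiles_by_molecule = {
        optimization.final_molecule_id: optimization.cmiles
        for optimization in optimizations
    }
    matched = []
    for hessian in hessians:
        cmiles = cmiles_by_molecule.get(hessian.molecule_id)
        if cmiles is None:
            # No parent optimization is known for this hessian, so skip it.
            continue
        matched.append(ToyHessian(hessian.record_id, hessian.molecule_id, cmiles))
    return matched


optimizations = [
    ToyOptimization(1, "[O:1]([H:2])[H:3]", final_molecule_id=100),
    ToyOptimization(2, "[C:1]([H:2])([H:3])([H:4])[H:5]", final_molecule_id=101),
]
hessians = [ToyHessian(10, 100), ToyHessian(11, 101), ToyHessian(12, 999)]

annotated = propagate_cmiles(optimizations, hessians)
```

In this toy data the third hessian has no matching parent and is dropped, which is one reason the number of derived results can differ from the number of optimizations.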
In addition to retrieving already computed datasets, the optimization result collection provides a utility for generating a new QC dataset based on the optimized conformers:
[14]:
from qcportal.singlepoint import SinglepointDriver
hessian_dataset = optimization_result_collection.create_basic_dataset(
    dataset_name="My Dataset",
    description="A dataset created from an optimization result collection.",
    tagline="Contains hessian data.",
    driver=SinglepointDriver.hessian,
)
The resulting dataset can then be submitted to a running QCFractal server.
Filtering result collections
A powerful feature of result collections is the ability to filter the entries they contain using a diverse range of filters, for example to remove specific molecules based on SMILES patterns, or to remove records where the connectivity of the molecule changed during the optimization.
The built-in filters are stored in the openff.qcsubmit.results.filters
module:
[15]:
from openff.qcsubmit.results import filters
Let’s apply some basic filters to our optimization collection:
[16]:
from qcportal.record_models import RecordStatusEnum

filtered_collection = optimization_result_collection.filter(
    filters.RecordStatusFilter(status=RecordStatusEnum.complete),
    filters.ConnectivityFilter(tolerance=1.2),
    filters.ElementFilter(
        # The elements supported by OpenFF 1.3.0
        allowed_elements=["H", "C", "N", "O", "S", "P", "F", "Cl", "Br", "I"]
    ),
    filters.ConformerRMSDFilter(max_conformers=10),
)
print("===========")
print(f"N RECORDS INITIAL: {optimization_result_collection.n_results}")
print(f"N RECORDS FINAL: {filtered_collection.n_results}")
print(f"N MOLECULES INITIAL: {optimization_result_collection.n_molecules}")
print(f"N MOLECULES FINAL: {filtered_collection.n_molecules}")
print("===========")
===========
N RECORDS INITIAL: 2398
N RECORDS FINAL: 1567
N MOLECULES INITIAL: 419
N MOLECULES FINAL: 419
===========
Here we have removed:
any incomplete records using the
RecordStatusFilter
records whose connectivity changed during the computation, e.g. because a proton transfer occurred, using the
ConnectivityFilter
records that were computed for molecules composed of elements that are not supported by the current OpenFF force fields
and finally, a ConformerRMSDFilter
was applied. When a collection contains multiple optimized conformers for the same molecule, the ConformerRMSDFilter
will only retain up to a maximum number of conformers for that molecule that are distinct to within a specified RMSD tolerance.
We could have also made use of the LowestEnergyFilter
to only retain the lowest energy conformer associated with each unique molecule in the collection.
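The idea of retaining only conformers that are distinct to within an RMSD tolerance can be sketched as a greedy selection. This is not QCSubmit's implementation: the function select_distinct, the conformer labels, and the precomputed RMSD table are all hypothetical, and the sketch ignores details such as heavy-atom-only comparison and automorphism checks.

```python
def select_distinct(conformer_ids, rmsd, max_conformers, rmsd_tolerance):
    """Greedily keep conformers at least `rmsd_tolerance` away from all kept ones."""
    kept = []
    for candidate in conformer_ids:
        if len(kept) >= max_conformers:
            break
        if all(rmsd(candidate, other) >= rmsd_tolerance for other in kept):
            kept.append(candidate)
    return kept


# Hypothetical pairwise RMSD values (in angstroms) between four conformers.
rmsd_table = {
    frozenset(("a", "b")): 0.1,
    frozenset(("a", "c")): 0.9,
    frozenset(("a", "d")): 1.2,
    frozenset(("b", "c")): 0.8,
    frozenset(("b", "d")): 1.1,
    frozenset(("c", "d")): 0.3,
}


def toy_rmsd(first, second):
    """Look up the precomputed RMSD for an unordered pair of conformers."""
    return rmsd_table[frozenset((first, second))]


distinct = select_distinct(
    ["a", "b", "c", "d"], toy_rmsd, max_conformers=10, rmsd_tolerance=0.5
)
capped = select_distinct(
    ["a", "b", "c", "d"], toy_rmsd, max_conformers=1, rmsd_tolerance=0.5
)
print(distinct, capped)
```

Here conformer b is dropped because it lies within 0.5 of a, and d is dropped because it lies within 0.5 of c. A greedy pass like this depends on the order in which conformers are visited.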
The filtered result collection will record provenance information about which filters were applied:
[17]:
filtered_collection.provenance
[17]:
{'applied-filters': {'RecordStatusFilter-0': {'status': <RecordStatusEnum.complete: 'complete'>},
'ConnectivityFilter-1': {'tolerance': 1.2},
'ElementFilter-2': {'allowed_elements': ['H',
'C',
'N',
'O',
'S',
'P',
'F',
'Cl',
'Br',
'I']},
'ConformerRMSDFilter-3': {'max_conformers': 10,
'rmsd_tolerance': 0.5,
'heavy_atoms_only': True,
'check_automorphs': True}}}
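The keys of the applied-filters mapping carry a numeric suffix that, in the output above, matches the order in which the filters were passed to filter. Assuming that convention holds, a short sketch for listing the filters in order could look like the following. The provenance dictionary is a simplified copy of the output above (the status enum is replaced by a plain string), and applied_filters_in_order is a hypothetical helper.

```python
# Simplified copy of the provenance shown above.
provenance = {
    "applied-filters": {
        "RecordStatusFilter-0": {"status": "complete"},
        "ConnectivityFilter-1": {"tolerance": 1.2},
        "ElementFilter-2": {
            "allowed_elements": ["H", "C", "N", "O", "S", "P", "F", "Cl", "Br", "I"]
        },
        "ConformerRMSDFilter-3": {
            "max_conformers": 10,
            "rmsd_tolerance": 0.5,
            "heavy_atoms_only": True,
            "check_automorphs": True,
        },
    }
}


def applied_filters_in_order(provenance):
    """Return (filter name, settings) pairs sorted by the numeric key suffix."""
    steps = []
    for key, settings in provenance["applied-filters"].items():
        # Split 'ConnectivityFilter-1' into the name and its index.
        name, _, index = key.rpartition("-")
        steps.append((int(index), name, settings))
    steps.sort(key=lambda step: step[0])
    return [(name, settings) for _, name, settings in steps]


ordered = applied_filters_in_order(provenance)
filter_names = [name for name, _ in ordered]
print(filter_names)
```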
Additional utilities
In addition to providing an interface for curating collections of QC results, the result collection objects also expose a number of quality-of-life utilities for visualizing and analysing the stored results.
A pdf showing the molecules within a result collection can be easily generated:
[18]:
from openff.toolkit import Molecule
from openff.qcsubmit.utils.visualize import molecules_to_pdf

# We filter by inchi key to make sure that we don't double count molecules
# with different orderings.
unique_smiles = set(
    {
        entry.inchi_key: entry.cmiles
        for entries in energy_result_collection.entries.values()
        for entry in entries
    }.values()
)
molecules = [
    Molecule.from_mapped_smiles(smiles, allow_undefined_stereo=True)
    for smiles in unique_smiles
]
molecules_to_pdf(molecules, "energy-result-collection.pdf", columns=8)
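The de-duplication step in the cell above relies on a dictionary keeping only one value per key. The following standalone sketch shows the same pattern on toy data; the ToyEntry type, the server address, and the placeholder keys "KEY-WATER" and "KEY-METHANE" are hypothetical stand-ins, not real InChI keys or QCSubmit objects.

```python
from collections import namedtuple

# Hypothetical stand-in for a result entry with an InChI key and a mapped SMILES.
ToyEntry = namedtuple("ToyEntry", ["inchi_key", "cmiles"])

# Entries grouped by server address, mirroring the nested loop in the cell above.
entries_by_server = {
    "https://example.invalid": [
        # The same molecule (water) stored twice with different atom orderings.
        ToyEntry("KEY-WATER", "[O:1]([H:2])[H:3]"),
        ToyEntry("KEY-WATER", "[H:1][O:2][H:3]"),
        ToyEntry("KEY-METHANE", "[C:1]([H:2])([H:3])([H:4])[H:5]"),
    ]
}

# A dict keeps one value per key, so later entries overwrite earlier ones.
smiles_by_key = {
    entry.inchi_key: entry.cmiles
    for entries in entries_by_server.values()
    for entry in entries
}
unique_smiles = set(smiles_by_key.values())
print(len(unique_smiles))
```

Because later entries overwrite earlier ones, the SMILES retained for a duplicated molecule is the last one encountered.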
Advanced visualization
Beyond the basic summary data shown above, we can also directly visualize the torsion drive using interactive plotly
graphs and the molecule.visualize("nglview")
method from the OpenFF Toolkit.
[12]:
import numpy as np
import plotly.graph_objects as go
from ipywidgets import widgets

torsion_drive_record, molecule = torsion_drive_records[0]

# Fill a 24 x 24 energy grid (15 degree spacing from -165 to 180 degrees).
energy_grid = np.zeros((24, 24))
psi_labels = [""] * 24
phi_labels = [""] * 24
for (phi, psi), qc_conformer in zip(
    molecule.properties["grid_ids"], molecule.conformers
):
    qc_energy = torsion_drive_record.final_energies[(phi, psi)]
    phi_bin = int(phi + 165) // 15
    psi_bin = int(psi + 165) // 15
    energy_grid[psi_bin, phi_bin] = qc_energy
    psi_labels[psi_bin] = psi
    phi_labels[phi_bin] = phi

fig = go.FigureWidget(
    data=go.Heatmap(
        z=energy_grid,
        x=phi_labels,
        y=psi_labels,
        colorbar={"title": "Energy (Ha)"},
        # Plotly hover templates use <br> for line breaks.
        hovertemplate="phi: %{x}<br>psi: %{y}<br>energy: %{z} Ha",
    ),
    layout=go.Layout(
        title="Val-Ala-Val - central backbone torsiondrive (Ha)",
        xaxis_title="Phi",
        yaxis_title="Psi",
        # autosize=False,
        yaxis_scaleanchor="x",
        xaxis_scaleanchor="y",
    ),
)
view = molecule.visualize("nglview")


def on_click(trace, points, selector):
    # Show the conformer that corresponds to the clicked heatmap cell.
    print(points)
    for x, y in points.point_inds:
        view.frame = x * 24 + y


heatmap = fig.data[0]
heatmap.on_click(on_click)
container = widgets.GridBox(
    [fig, view],
)
container
[12]:
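The binning arithmetic used in the cell above can be checked in isolation. The sketch below assumes a 15 degree grid running from -165 to 180 degrees, as in this torsion drive. The helper names are hypothetical, and the toy grid_ids list is generated phi-first purely for illustration. Building a lookup from grid ID to conformer index avoids relying on any particular ordering of the stored conformers.

```python
GRID_MIN = -165  # Smallest scanned angle, in degrees.
GRID_SPACING = 15  # Spacing between scanned angles, in degrees.
N_BINS = 24  # Number of grid points per torsion.


def angle_to_bin(angle):
    """Map a grid angle in degrees to its row or column index in the energy grid."""
    return (int(angle) - GRID_MIN) // GRID_SPACING


def bin_to_angle(index):
    """Map a row or column index back to its grid angle in degrees."""
    return GRID_MIN + index * GRID_SPACING


# Toy grid IDs generated with phi varying slowest (an assumption for this demo).
grid_ids = [
    (bin_to_angle(phi_bin), bin_to_angle(psi_bin))
    for phi_bin in range(N_BINS)
    for psi_bin in range(N_BINS)
]

# A lookup from grid ID to conformer index that works for any ordering.
frame_by_grid_id = {grid_id: index for index, grid_id in enumerate(grid_ids)}

print(angle_to_bin(-165), angle_to_bin(180), frame_by_grid_id[(-165, -150)])
```

With a lookup like frame_by_grid_id, a click handler could translate the clicked phi and psi values into a frame index directly, rather than computing it from row and column positions.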
Cached queries
If you provided the cache_dir
argument to your PortalClient
initially, providing the same cache_dir
to future clients allows them to share the underlying SQLite cache. The cell below temporarily replaces socket.socket
with a function that raises an exception, disabling network access, to demonstrate that the new calls to get_dataset
and BasicResultCollection.from_datasets
can be performed without new requests to QCArchive.
[19]:
import socket

client = PortalClient("https://api.qcarchive.molssi.org:443", cache_dir=".")


def guard(*args, **kwargs):
    raise Exception("I told you not to use the Internet!")


# Temporarily replace socket.socket so that any new network connection fails.
old_socket = socket.socket
socket.socket = guard
try:
    ds = client.get_dataset("singlepoint", "OpenFF BCC Refit Study COH v1.0")
    erc = BasicResultCollection.from_datasets([ds], "spec_2")
    assert erc.n_results == energy_result_collection.n_results
finally:
    # Always restore network access, even if the calls above fail.
    socket.socket = old_socket
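The same guard can be packaged as a context manager so that the original socket.socket is restored even if an error occurs inside the block. This is a sketch using only the standard library; the name network_disabled is hypothetical and is not provided by QCSubmit or QCPortal.

```python
import socket
from contextlib import contextmanager


@contextmanager
def network_disabled():
    """Temporarily make any attempt to create a new socket raise an error."""
    original_socket = socket.socket

    def guard(*args, **kwargs):
        raise RuntimeError("Network access is disabled inside this block.")

    socket.socket = guard
    try:
        yield
    finally:
        # Restore the real socket class no matter how the block exits.
        socket.socket = original_socket


original = socket.socket
blocked = False
with network_disabled():
    try:
        socket.socket()
    except RuntimeError:
        blocked = True

restored = socket.socket is original
print(blocked, restored)
```

Cached calls such as those in the cell above could then be placed inside a with network_disabled() block. Note that this only affects code that looks up socket.socket when it opens a connection.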