PhysicalPropertyDataSet
-
class
openff.evaluator.datasets.
PhysicalPropertyDataSet
An object for storing and curating data sets of both measured and estimated physical properties. This class defines a number of convenience functions for filtering out unwanted properties, and for generating general statistics (such as the number of properties per substance) about the set.
Methods
__init__
()Constructs a new PhysicalPropertyDataSet object.
add_properties
(*physical_properties[, validate])Adds one or more physical properties to the data set.
filter_by_components
(number_of_components)Filter the data set based on the number of components present in the substance the data points were collected for.
filter_by_elements
(*allowed_elements)Filters out those properties which were estimated for compounds which contain elements outside of those defined in allowed_elements.
filter_by_function
(filter_function)Filter the data set using a given filter function.
filter_by_phases
(phases)Filter the data set based on the phase of the property (e.g. liquid).
filter_by_pressure
(min_pressure, max_pressure)Filter the data set based on a minimum and maximum pressure.
filter_by_property_types
(*property_types)Filter the data set based on the type of property (e.g. Density).
filter_by_smiles
(*allowed_smiles)Filters out those properties which were estimated for compounds which do not appear in the allowed smiles list.
filter_by_temperature
(min_temperature, …)Filter the data set based on a minimum and maximum temperature.
filter_by_uncertainties
()Filters out those properties which don’t have their uncertainties reported.
from_json
(file_path)Create this object from a JSON file.
json
([file_path, format])Creates a JSON representation of this class.
merge
(data_set[, validate])Merge another data set into the current one.
parse_json
(string_contents[, encoding])Parses a typed json string into the corresponding class structure.
properties_by_substance
(substance)A generator which may be used to loop over all of the properties which were measured for a particular substance.
properties_by_type
(property_type)A generator which may be used to loop over all of the properties of a particular type, e.g. all “Density” properties.
to_pandas
()Converts a PhysicalPropertyDataSet to a pandas.DataFrame object.
validate
()Checks to ensure that all properties within the set are valid physical property objects.
Attributes
properties
A list of all of the properties within this set.
property_types
The types of property within this data set.
sources
The sources from which the properties in this data set were gathered.
substances
The substances for which the properties in this data set were collected.
-
property
properties
A list of all of the properties within this set.
- Type
tuple of PhysicalProperty
-
property
property_types
The types of property within this data set.
- Type
set of str
-
property
substances
The substances for which the properties in this data set were collected.
- Type
set of Substance
-
property
sources
The sources from which the properties in this data set were gathered.
- Type
set of Source
-
merge
(data_set, validate=True) Merge another data set into the current one.
- Parameters
data_set (PhysicalPropertyDataSet) – The secondary data set to merge into this one.
validate (bool) – Whether to validate the other data set before merging.
-
add_properties
(*physical_properties, validate=True) Adds one or more physical properties to the data set.
- Parameters
physical_properties (PhysicalProperty) – The physical properties to add.
validate (bool) – Whether to validate the properties before adding them to the set.
-
properties_by_substance
(substance) A generator which may be used to loop over all of the properties which were measured for a particular substance.
- Parameters
substance (Substance) – The substance of interest.
- Returns
- Return type
generator of PhysicalProperty
-
properties_by_type
(property_type) A generator which may be used to loop over all of the properties of a particular type, e.g. all “Density” properties.
- Parameters
property_type (str or type of PhysicalProperty) – The type of property of interest. This may either be the string class name of the property or the class type.
- Returns
- Return type
generator of PhysicalProperty
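The parameter description says a property type may be given either as a class or as its string class name. The following is a minimal sketch of how such a lookup could work, not the library's implementation: the `Density` and `DielectricConstant` classes are empty stand-ins, and the free function `properties_by_type` is a hypothetical helper that only mirrors the method's name.

```python
class Density:
    """Hypothetical stand-in for a property class."""


class DielectricConstant:
    """Hypothetical stand-in for a property class."""


def properties_by_type(properties, property_type):
    """Yield the properties whose class name matches `property_type`."""
    # Accept either the class itself or its string name.
    if isinstance(property_type, str):
        type_name = property_type
    else:
        type_name = property_type.__name__

    for physical_property in properties:
        if type(physical_property).__name__ == type_name:
            yield physical_property


properties = [Density(), DielectricConstant(), Density()]

# Both spellings select the same two stand-in objects.
by_name = list(properties_by_type(properties, "Density"))
by_class = list(properties_by_type(properties, Density))
```

Because the result is a generator, it is consumed lazily; wrapping it in `list` as above is only needed when the matches have to be reused.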
-
filter_by_function
(filter_function) Filter the data set using a given filter function.
- Parameters
filter_function (callable) – The filter function.
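A filter function is a callable that receives a single property and reports whether to keep it. The sketch below assumes, since it is not stated above, that a property is retained when the function returns True. It uses `SimpleNamespace` stand-ins with a plain-float `value` attribute rather than real PhysicalProperty objects, and `value_at_least` is a hypothetical helper.

```python
from types import SimpleNamespace

# Stand-ins for PhysicalProperty objects; `value` is a plain float here,
# used purely for illustration.
properties = [
    SimpleNamespace(id="a", value=0.95),
    SimpleNamespace(id="b", value=1.20),
    SimpleNamespace(id="c", value=0.70),
]


def value_at_least(threshold):
    """Build a filter function that accepts values at or above `threshold`."""

    def filter_function(physical_property):
        return physical_property.value >= threshold

    return filter_function


# Assumed semantics: a property is retained when the function returns True.
keep = value_at_least(0.9)
retained_ids = [p.id for p in filter(keep, properties)]
print(retained_ids)
```

Building the predicate with a small factory keeps the threshold out of the function body, so the same callable shape can be reused with different cut-offs.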
-
filter_by_property_types
(*property_types) Filter the data set based on the type of property (e.g. Density).
- Parameters
property_types (PropertyType or str) – The type of property which should be retained.
Examples
Filter the dataset to only contain densities and static dielectric constants:
>>> # Load in the data set of properties which will be used for comparisons
>>> from openff.evaluator.datasets.thermoml import ThermoMLDataSet
>>> data_set = ThermoMLDataSet.from_doi('10.1016/j.jct.2016.10.001')
>>>
>>> # Filter the dataset to only include densities and dielectric constants.
>>> from openff.evaluator.properties import Density, DielectricConstant
>>> data_set.filter_by_property_types(Density, DielectricConstant)
or
>>> data_set.filter_by_property_types('Density', 'DielectricConstant')
-
filter_by_phases
(phases) Filter the data set based on the phase of the property (e.g. liquid).
- Parameters
phases (PropertyPhase) – The phase of property which should be retained.
Examples
Filter the dataset to only include liquid properties.
>>> # Load in the data set of properties which will be used for comparisons
>>> from openff.evaluator.datasets.thermoml import ThermoMLDataSet
>>> data_set = ThermoMLDataSet.from_doi('10.1016/j.jct.2016.10.001')
>>>
>>> from openff.evaluator.datasets import PropertyPhase
>>> data_set.filter_by_phases(PropertyPhase.Liquid)
-
filter_by_temperature
(min_temperature, max_temperature) Filter the data set based on a minimum and maximum temperature.
- Parameters
min_temperature (pint.Quantity) – The minimum temperature.
max_temperature (pint.Quantity) – The maximum temperature.
Examples
Filter the dataset to only include properties measured between 130 K and 260 K.
>>> # Load in the data set of properties which will be used for comparisons
>>> from openff.evaluator.datasets.thermoml import ThermoMLDataSet
>>> data_set = ThermoMLDataSet.from_doi('10.1016/j.jct.2016.10.001')
>>>
>>> from openff.evaluator import unit
>>> data_set.filter_by_temperature(min_temperature=130*unit.kelvin, max_temperature=260*unit.kelvin)
-
filter_by_pressure
(min_pressure, max_pressure) Filter the data set based on a minimum and maximum pressure.
- Parameters
min_pressure (pint.Quantity) – The minimum pressure.
max_pressure (pint.Quantity) – The maximum pressure.
Examples
Filter the dataset to only include properties measured between 70 kPa and 150 kPa.
>>> # Load in the data set of properties which will be used for comparisons
>>> from openff.evaluator.datasets.thermoml import ThermoMLDataSet
>>> data_set = ThermoMLDataSet.from_doi('10.1016/j.jct.2016.10.001')
>>>
>>> from openff.evaluator import unit
>>> data_set.filter_by_pressure(min_pressure=70*unit.kilopascal, max_pressure=150*unit.kilopascal)
-
filter_by_components
(number_of_components) Filter the data set based on the number of components present in the substance the data points were collected for.
- Parameters
number_of_components (int) – The allowed number of components in the mixture.
Examples
Filter the dataset to only include pure substance properties.
>>> # Load in the data set of properties which will be used for comparisons
>>> from openff.evaluator.datasets.thermoml import ThermoMLDataSet
>>> data_set = ThermoMLDataSet.from_doi('10.1016/j.jct.2016.10.001')
>>>
>>> data_set.filter_by_components(number_of_components=1)
-
filter_by_elements
(*allowed_elements) Filters out those properties which were estimated for compounds which contain elements outside of those defined in allowed_elements.
- Parameters
allowed_elements (str) – The symbols (e.g. C, H, Cl) of the elements to retain.
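The rule described here amounts to a subset test: a compound is kept only when every element it contains is in the allowed list. The sketch below illustrates that test on hand-written element sets. The `compound_elements` mapping and the `only_allowed_elements` helper are hypothetical, and the sketch does not show how the library extracts elements from a substance.

```python
# Hypothetical stand-in data: each compound name maps to the set of element
# symbols it contains.
compound_elements = {
    "ethanol": {"C", "H", "O"},
    "chloroform": {"C", "H", "Cl"},
    "methane": {"C", "H"},
}


def only_allowed_elements(elements, *allowed_elements):
    """Return True when every element is one of `allowed_elements`."""
    return set(elements) <= set(allowed_elements)


# Keep compounds built only from C, H and O; chloroform contains Cl.
kept = sorted(
    name
    for name, elements in compound_elements.items()
    if only_allowed_elements(elements, "C", "H", "O")
)
print(kept)
```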
-
filter_by_smiles
(*allowed_smiles) Filters out those properties which were estimated for compounds which do not appear in the allowed smiles list.
- Parameters
allowed_smiles (str) – The smiles identifiers of the compounds to keep after filtering.
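As a rough illustration of this rule, the sketch below keeps a substance only when all of its components appear in the allowed list. Two points are assumptions rather than documented behaviour: that a mixture is rejected if any single component is missing from the list, and that SMILES strings are compared exactly without canonicalisation. The `all_components_allowed` helper and the tuple representation of a substance are hypothetical.

```python
def all_components_allowed(component_smiles, *allowed_smiles):
    """Return True when every component appears in `allowed_smiles`."""
    allowed = set(allowed_smiles)
    # Plain string comparison; no SMILES canonicalisation is attempted.
    return all(smiles in allowed for smiles in component_smiles)


# Hypothetical substances, each written as a tuple of component SMILES.
substances = [("O",), ("CCO",), ("O", "CCO"), ("O", "CO")]

# Water and ethanol are allowed; the water/methanol mixture is dropped.
kept = [s for s in substances if all_components_allowed(s, "O", "CCO")]
print(kept)
```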
-
filter_by_uncertainties
() Filters out those properties which don’t have their uncertainties reported.
-
validate
() Checks to ensure that all properties within the set are valid physical property objects.
-
classmethod
from_json
(file_path) Create this object from a JSON file.
- Parameters
file_path (str) – The path to load the JSON from.
- Returns
The parsed class.
- Return type
cls
-
json
(file_path=None, format=False) Creates a JSON representation of this class.
-
classmethod
parse_json
(string_contents, encoding='utf8') Parses a typed json string into the corresponding class structure.
-
to_pandas
() Converts a PhysicalPropertyDataSet to a pandas.DataFrame object with the following columns:
‘Id’
‘Temperature (K)’
‘Pressure (kPa)’
‘Phase’
‘N Components’
‘Component 1’
‘Role 1’
‘Mole Fraction 1’
‘Exact Amount 1’
…
‘Component N’
‘Role N’
‘Mole Fraction N’
‘Exact Amount N’
‘<Property 1> Value (<default unit>)’
‘<Property 1> Uncertainty / (<default unit>)’
…
‘<Property N> Value / (<default unit>)’
‘<Property N> Uncertainty / (<default unit>)’
‘Source’
where ‘Component X’ is a column containing the smiles representation of component X.
- Returns
The created data frame.
- Return type
pandas.DataFrame
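To make the column layout above concrete, the sketch below generates the header list for a given number of components and set of property types. It is an illustration only. The helper `expected_columns`, its arguments, and the example unit string are hypothetical. The value and uncertainty headers are written with inconsistent separators in the list above, so the `Value (<unit>)` form used here is an assumption.

```python
def expected_columns(n_components, property_units):
    """List column names following the layout described for to_pandas.

    `property_units` maps a property name to a default unit string; both
    the argument and this helper are hypothetical.
    """
    columns = ["Id", "Temperature (K)", "Pressure (kPa)", "Phase", "N Components"]

    # One block of four columns per component.
    for index in range(1, n_components + 1):
        columns += [
            f"Component {index}",
            f"Role {index}",
            f"Mole Fraction {index}",
            f"Exact Amount {index}",
        ]

    # One value column and one uncertainty column per property type
    # (header format assumed, see the note above).
    for name, unit in property_units.items():
        columns += [f"{name} Value ({unit})", f"{name} Uncertainty ({unit})"]

    columns.append("Source")
    return columns


columns = expected_columns(2, {"Density": "g / ml"})
print(columns)
```

The number of component blocks follows the largest substance in the set, which is why the listing uses 'Component 1' through 'Component N'.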
-
property