Annotations

Note

This document was written in November 2024 assuming

  • Python 3.10+

  • Pydantic V2 (not the V1 API backdoor)

  • Mypy 1.10+ or similarly-functional Python type-checker

  • Interchange 0.4+

Consider a class to store information about atoms. It needs to store an atomic mass, a partial charge, and a position in 3-D space. The following class would work fine:

class Atom:
    mass: float
    charge: float
    position: list

but this has some shortcomings:

  • What units are assumed for each attribute?

  • What if the position is a numpy.ndarray (or openmm.Vec3)?

  • Can instances of this class be serialized to disk? And then de-serialized?

  • Are any fields optional?

OpenFF’s combined use of Pint and Pydantic solves these problems. To leverage these tools,

  1. Classes should subclass out of openff.interchange.pydantic._BaseModel, which is a thin wrapper around pydantic.BaseModel

  2. Annotations should use the openff.interchange._annotations._*Quantity types, which are thin wrappers around openff.units.Quantity, itself having the same behavior as pint.Quantity, or annotations derived thereof.

The same class can be rewritten:

from openff.interchange.pydantic import _BaseModel
from openff.interchange._annotations import _Quantity
from openff.units import Quantity


class Atom(_BaseModel):
    mass: _Quantity
    charge: _Quantity
    position: _Quantity | None = None


carbon = Atom(
    mass=_Quantity("12.011 amu"),
    charge=_Quantity("-0.1 elementary_charge"),
    position=_Quantity([0, 0, 0], "nanometer"),
)

print(carbon.model_dump())
# {'mass': {'val': 12.011, 'unit': 'unified_atomic_mass_unit'}, 'charge': {'val': -0.1, 'unit': 'elementary_charge'}, 'position': {'val': [0, 0, 0], 'unit': 'nanometer'}}

print(carbon.model_dump_json())
# {"mass":{"val":12.011,"unit":"unified_atomic_mass_unit"},"charge":{"val":-0.1,"unit":"elementary_charge"},"position":{"val":[0,0,0],"unit":"nanometer"}}

print(Atom.model_validate_json(carbon.model_dump_json()))
# Atom(mass=<Quantity(12.011, 'unified_atomic_mass_unit')>, charge=<Quantity(-0.1, 'elementary_charge')>, position=<Quantity([0 0 0], 'nanometer')>)

This is an improvement, but units are still implicit. These annotations can be more specific, optionally defining the dimensionality or the specific unit. The _annotations module defines a few of these out of the box for convenience, but more can be defined by the user using the typing.Annotated pattern in the source code.

from openff.interchange._annotations import (
    _ElementaryChargeQuantity,
    _DistanceQuantity,
    _MassQuantity,
)


class Atom(_BaseModel):
    mass: _MassQuantity
    charge: _ElementaryChargeQuantity
    position: _DistanceQuantity | None = None

With the use of _DistanceQuantity, there is a dimensionality check during validation that prevents incompatible units from being passed (compatible units will checked, but not converted):

try:
    carbon = Atom(
        mass=Quantity("12.011 amu"),
        charge=Quantity("-0.1 elementary_charge"),
        position=Quantity([1, 2, 3], "kJ/mol"),
    )
except ValidationError as error:
    print(str(error))
# 1 validation error for Atom
# position
#   Value error, Dimensionality of quantity=<Quantity([1 2 3], 'kilojoule / mole')> is not compatible with unit='nanometer' [type=value_error, input_value=<Quantity([1 2 3], 'kilojoule / mole')>, input_type=Quantity]
#     For further information visit https://errors.pydantic.dev/2.9/v/value_error

The charge field was defined with an annotation that specifies not only dimensionality, but the specific unit. In this case, inputs are validated and converted to the specified units:

print(
    Atom(
        mass=Quantity(12.011, "amu"),
        charge=Quantity("-0.1 / 6.241509074460763e+18 coulomb"),
    ).charge
)
# -0.10000000000000002 elementary_charge

This can be useful if i.e. it is important that positions are universally stored with Angstroms and not nanometers.

Note that this Annotated trick is specifically used by Pydantic classes, and will cause an error at runtime when violated - static type checkers like MyPy and PyRight will not be able to validate that units and dimensionality are correct.