fix(base_quantity): Removed string return from serialization of pint …

…quantity. (#28)

List of changes:
- Removed `str(input_value.magnitude)` so that `.model_dump()` returns a proper number instead of a string.
- Added a specific serialization option for `.model_dump(mode="json")` and `.model_dump_json()` that returns a string version of the field. This is only important for pint quantities: it returns something like `10 kV`, which is easily deserialized with pint.
- Added a better type check for instances of `BaseQuantity` used on the pydantic model.
- Relaxed the schema serialization since pint will handle most of it.
- Added better type hints for pydantic classmethods.
NREL · Jun 5, 2024 · c2f6b70
commit c2f6b70

Showing 74 changed files with 14,308 additions and 0 deletions.
diff --git a/.buildinfo b/.buildinfo
@@ -0,0 +1,4 @@
+# Sphinx build info version 1
+# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
+config: 6af4bc603a8056b95636de2aea193ceb
+tags: 645f666f9bcd5a90fca523b33c5a78b7
diff --git a/.nojekyll b/.nojekyll
diff --git a/_sources/explanation/components.md.txt b/_sources/explanation/components.md.txt
@@ -0,0 +1,45 @@
+```{eval-rst}
+.. _components-page:
+```
+# Components
+A component is any element that is attached to a system.
+
+All components are required to define a name as a string (it is required in the base class). This
+may not be appropriate for all classes. The `Location` class in this package is one example. In
+cases like that developers can define their own name field and set its default value to `""`.
+
+Refer to the [Components API](#components-api) for more information.
+
+## Inheritance
+Recommended rule: A `Component` that has subclasses should never be directly instantiated.
+
+Consider a scenario where a developer defines a `Load` class and then later decides a new load is
+needed because of one custom field.
+
+The temptation may be to create `CustomLoad(Load)`. This is problematic given the design of
+the `infrasys` API: there would be no way to retrieve only `Load` instances. Consider this example:
+
+```python
+for load in system.get_components(Load):
+    print(load.name)
+```
+
+This will retrieve both `Load` and `CustomLoad` instances.
+
+Instead, our recommendation is to create a base class with the common fields.
+
+```python
+class LoadBase(Component):
+    """Defines common fields for all Loads."""
+
+    common_field1: float
+    common_field2: float
+
+class Load(LoadBase):
+    """A load component"""
+
+class CustomLoad(LoadBase):
+    """A custom load component"""
+
+    custom_field: float
+```
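The retrieval problem above can be reproduced without `infrasys`. The following is a minimal sketch, assuming that lookup by type behaves like an `isinstance` check; the `Component` class and `get_components` function here are hypothetical stand-ins, not the library's own code.

```python
class Component:
    """Stand-in for a component base class (hypothetical, for illustration only)."""

    def __init__(self, name: str) -> None:
        self.name = name


def get_components(components, component_type):
    """Hypothetical lookup: isinstance also matches subclasses."""
    return [c for c in components if isinstance(c, component_type)]


# Problematic design: CustomLoad inherits from Load.
class Load(Component):
    pass


class CustomLoad(Load):
    pass


mixed = get_components([Load("load1"), CustomLoad("custom1")], Load)
mixed_names = [c.name for c in mixed]  # both instances match Load


# Recommended design: siblings under a shared base class.
class LoadBase(Component):
    pass


class SiblingLoad(LoadBase):
    pass


class SiblingCustomLoad(LoadBase):
    pass


siblings = [SiblingLoad("load1"), SiblingCustomLoad("custom1")]
only_loads = [c.name for c in get_components(siblings, SiblingLoad)]
all_loads = [c.name for c in get_components(siblings, LoadBase)]
```

With siblings under `LoadBase`, a query for the concrete type returns only that type, while a query for the base class still returns everything.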
diff --git a/_sources/explanation/index.md.txt b/_sources/explanation/index.md.txt
@@ -0,0 +1,16 @@
+```{eval-rst}
+.. _explanation-page:
+```
+# Explanation
+
+```{eval-rst}
+.. toctree::
+    :maxdepth: 2
+    :caption: Contents:
+
+    system
+    components
+    time_series
+    location
+    serialization
+```
diff --git a/_sources/explanation/location.md.txt b/_sources/explanation/location.md.txt
@@ -0,0 +1,4 @@
+# Location
+Components can compose this class in order to specify their geographic location.
+
+Refer to the [Location API](#location-api) for more information.
diff --git a/_sources/explanation/serialization.md.txt b/_sources/explanation/serialization.md.txt
@@ -0,0 +1,122 @@
+# Serialization
+This page describes how `infrasys` serializes a system and its components to JSON when a user calls
+`System.to_json()` and how it de-serializes them when a user calls `System.from_json()`.
+
+## Components
+`infrasys` converts its nested dictionaries of components-by-type into a flat array. Each component
+records metadata about its actual Python type into a field called `__metadata__`. Here is an example
+of a serialized `Location` object. Note that it includes the module and type. `infrasys` uses this
+information during de-serialization to dynamically import the type and construct it. This allows
+serialization to work with types defined outside of `infrasys` as long as the user has imported
+those types.
+
+```json
+{
+  "uuid": "1e5f90ae-a386-4c8a-89ae-0ed123da3e26",
+  "name": null,
+  "x": 0.0,
+  "y": 0.0,
+  "crs": null,
+  "__metadata__": {
+    "fields": {
+      "module": "infrasys.location",
+      "type": "Location",
+      "serialized_type": "base"
+    }
+  }
+},
+```
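The dynamic import described above can be sketched with the standard library. This is a minimal illustration, not the `infrasys` implementation: `resolve_type` is a hypothetical helper, and the metadata points at `fractions.Fraction` only so that the snippet runs without `infrasys` installed.

```python
import importlib


def resolve_type(metadata: dict) -> type:
    """Hypothetical helper: import the module named in the metadata and look up the type."""
    fields = metadata["fields"]
    module = importlib.import_module(fields["module"])
    return getattr(module, fields["type"])


# Same shape as the "__metadata__" field shown above, but with a stdlib type.
metadata = {
    "fields": {
        "module": "fractions",
        "type": "Fraction",
        "serialized_type": "base",
    }
}

cls = resolve_type(metadata)
value = cls(1, 2)  # construct an instance of the dynamically resolved type
```

Because the type is looked up by module path at load time, the module must be importable in the environment that de-serializes the file.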
+
+### Composed components
+There are many cases where one component will contain an instance of another component. For example,
+a `Bus` may contain a `Location` or a `Generator` may contain a `Bus`. When serializing each
+component, `infrasys` checks the type of each of that component's fields. If a value is another
+component (which means that it must also be attached to the system), `infrasys` replaces that instance
+with its UUID. It does this to avoid duplicating data in the JSON file.
+
+Here is an example of a serialized `Bus`. Note the value for the `coordinates` field. It contains the
+type and UUID of the actual `coordinates`. During de-serialization, `infrasys` will detect this
+condition and only attempt to de-serialize the bus once all `Location` instances have been
+de-serialized.
+
+```json
+{
+  "uuid": "e503984a-3285-43b6-84c2-805eb3889210",
+  "name": "bus1",
+  "voltage": 1.1,
+  "coordinates": {
+    "__metadata__": {
+      "fields": {
+        "module": "infrasys.location",
+        "type": "Location",
+        "serialized_type": "composed_component",
+        "uuid": "1e5f90ae-a386-4c8a-89ae-0ed123da3e26"
+      }
+    }
+  },
+  "__metadata__": {
+    "fields": {
+      "module": "tests.models.simple_system",
+      "type": "SimpleBus",
+      "serialized_type": "base"
+    }
+  }
+},
+```
+
+#### Denormalized component data
+There are cases where users may prefer to have the full, denormalized JSON data for a component.
+All components are of type `pydantic.BaseModel` and so implement the method `model_dump_json`.
+
+Here is an example of a bus serialized that way (`bus.model_dump_json(indent=2)`):
+
+```json
+{
+  "uuid": "e503984a-3285-43b6-84c2-805eb3889210",
+  "name": "bus1",
+  "voltage": 1.1,
+  "coordinates": {
+    "uuid": "1e5f90ae-a386-4c8a-89ae-0ed123da3e26",
+    "name": null,
+    "x": 0.0,
+    "y": 0.0,
+    "crs": null
+  }
+}
+```
+
+### Pint Quantities
+`infrasys` encodes metadata into component JSON when that component contains a `pint.Quantity`
+instance. Here is an example of such a component:
+
+```json
+{
+  "uuid": "711d2724-5814-4e0e-be5f-4b0b825b7f07",
+  "name": "test",
+  "distance": {
+    "value": 2,
+    "units": "meter",
+    "__metadata__": {
+      "fields": {
+        "module": "infrasys.quantities",
+        "type": "Distance",
+        "serialized_type": "quantity"
+      }
+    }
+  },
+  "__metadata__": {
+    "fields": {
+      "module": "tests.test_serialization",
+      "type": "ComponentWithPintQuantity",
+      "serialized_type": "base"
+    }
+  }
+}
+```
+
+## Time Series
+If the user stores time series data in Arrow files (default behavior), then `infrasys` will copy
+the Arrow files into the user-specified directory in `system.to_json()`.
+
+If the user instead chose to store time series in memory, then `infrasys` will serialize that data
+into Arrow files in the user-specified directory in `system.to_json()`.
diff --git a/_sources/explanation/system.md.txt b/_sources/explanation/system.md.txt
@@ -0,0 +1,60 @@
+# System
+The System class provides a data store for components and time series data.
+
+Refer to the [System API](#system-api) for complete information.
+
+## Items to consider for parent packages
+
+### Composition vs Inheritance
+Parent packages must choose one of the following:
+
+1. Derive a custom System class that inherits from `infrasys.System`. Re-implement methods
+as desired. Add custom attributes to the System that will be serialized to JSON.
+
+    - Reimplement `System.add_components` in order to perform custom validation or custom behavior.
+      This is only needed for validation that needs information from both the system and the
+      component. Note that the `System` constructor provides the keyword argument
+      `auto_add_composed_components` that dictates how to handle the condition where a component
+      contains another component which is not already attached to the system.
+
+    - Reimplement `System.serialize_system_attributes` and `System.deserialize_system_attributes`.
+      `infrasys` will call those methods during `to_json` and `from_json` and serialize/de-serialize
+      the contents.
+
+    - Reimplement `System.data_format_version` and `System.handle_data_format_upgrade`. `infrasys`
+      will call the upgrade function if it detects a version change during de-serialization.
+
+2. Implement an independent System class and compose the `infrasys.System`. This can be beneficial
+if you want to make the underlying system opaque to users.
+
+    - This pattern requires that you call `System.to_json()` with the keyword argument `data` set
+      to a dictionary containing your system's attributes. `infrasys` will add its contents to a
+      field called `system` inside that dictionary.
+
+3. Use `infrasys.System` directly. This is probably not what most packages want because they will
+not be able to serialize custom attributes or implement specialized behavior as discussed above.
+
+### Units
+`infrasys` uses the [pint library](https://pint.readthedocs.io/en/stable/) to help manage units.
+Package developers should consider storing fields that are quantities as subtypes of
+[BaseQuantity](#base-quantity-api). Pint performs unit conversion automatically when performing
+arithmetic.
+
+If you want to be able to generate JSON schema for a model that contains a Pint quantity, you must
+add an annotation as shown below. Otherwise, Pydantic will raise an exception.
+
+```python
+from typing import Annotated
+
+from pydantic import WithJsonSchema
+from infrasys import Component
+from infrasys.quantities import Distance
+
+class ComponentWithPintQuantity(Component):
+
+    distance: Annotated[Distance, WithJsonSchema({"type": "string"})]
+
+ComponentWithPintQuantity.model_json_schema()
+```
+
+**Notes**:
+- `infrasys` includes some basic quantities in [infrasys.quantities](#quantity-api).
+- Pint will automatically convert a list or list of lists of values into a `numpy.ndarray`.
+`infrasys` handles serialization/de-serialization of these types.
diff --git a/_sources/explanation/time_series.md.txt b/_sources/explanation/time_series.md.txt
@@ -0,0 +1,48 @@
+# Time Series
+`infrasys` supports time series data expressed as a one-dimensional array of floats
+using the class [SingleTimeSeries](#singe-time-series-api). Users must provide a `variable_name`
+that is typically the field of a component being modeled. For example, if the user has a time array
+associated with the active power of a generator, they would assign
+`variable_name = "active_power"`.
+
+Here is an example of how to create an instance of `SingleTimeSeries`:
+
+```python
+import random
+from datetime import datetime, timedelta
+
+from infrasys.time_series_models import SingleTimeSeries
+
+time_series = SingleTimeSeries.from_array(
+    data=[random.random() for _ in range(24)],
+    variable_name="active_power",
+    initial_time=datetime(year=2030, month=1, day=1),
+    resolution=timedelta(hours=1),
+)
+```
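To make the roles of `initial_time` and `resolution` concrete, here is a small standard-library sketch. It assumes, as an interpretation of the example above rather than a statement about `infrasys` internals, that value `i` of the array corresponds to the timestamp `initial_time + i * resolution`.

```python
from datetime import datetime, timedelta

initial_time = datetime(year=2030, month=1, day=1)
resolution = timedelta(hours=1)
length = 24  # matches the 24 values in the example above

# Assumed mapping: value i belongs to initial_time + i * resolution.
timestamps = [initial_time + i * resolution for i in range(length)]

first, last = timestamps[0], timestamps[-1]
```

Under that assumption, the 24 hourly values cover 2030-01-01 from 00:00 through 23:00.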
+
+Users can attach their own attributes to each time array. For example,
+there might be different profiles for different scenarios or model years.
+
+```python
+time_series = SingleTimeSeries.from_array(
+    data=[random.random() for x in range(24)],
+    variable_name="active_power",
+    initial_time=datetime(year=2030, month=1, day=1),
+    resolution=timedelta(hours=1),
+    scenario="high",
+    model_year="2035",
+)
+```
+
+## Behaviors
+Users can customize time series behavior with these flags passed to the `System` constructor:
+
+- `time_series_in_memory`: The `System` stores each array of data in an Arrow file by default. This
+is a binary file that enables efficient storage and row access. Set this flag to store the data in
+memory instead.
+- `time_series_read_only`: The default behavior allows users to add and remove time series data.
+Set this flag to disable mutation. That can be useful if you are de-serializing a system, won't be
+changing it, and want to avoid copying the data.
+- `time_series_directory`: The `System` stores time series data on the computer's tmp filesystem by
+default. This filesystem may be of limited size. If your data will exceed that limit, such as what
+is likely to happen on an HPC compute node, set this parameter to an alternate location (such as
+`/tmp/scratch` on NREL's HPC systems).
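Before deciding whether to set `time_series_directory`, it can help to check how much space the temporary filesystem has. The sketch below uses only the standard library and assumes that the default location is the one reported by `tempfile.gettempdir()`, which may not match what `infrasys` actually uses on your machine.

```python
import shutil
import tempfile

# Assumption: the default time series location lives under the system temp directory.
tmp_dir = tempfile.gettempdir()

# disk_usage returns total, used, and free space in bytes.
usage = shutil.disk_usage(tmp_dir)
free_gib = usage.free / 1024**3

print(f"{tmp_dir}: {free_gib:.1f} GiB free")
```

If the free space is smaller than the expected size of your time series data, pass an alternate directory through `time_series_directory`.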
+
+Refer to the [Time Series API](#time-series-api) for more information.
diff --git a/_sources/how_tos/index.md.txt b/_sources/how_tos/index.md.txt
@@ -0,0 +1,12 @@
+```{eval-rst}
+.. _how-tos-page:
+```
+# How Tos
+
+```{eval-rst}
+.. toctree::
+    :maxdepth: 2
+    :caption: Contents:
+
+    list_time_series
+```
diff --git a/_sources/how_tos/list_time_series.md.txt b/_sources/how_tos/list_time_series.md.txt
@@ -0,0 +1,63 @@
+# How to list existing time series data
+
+Suppose that you have added multiple time series arrays to your components using differing
+names and attributes. How can you see what is present?
+
+This example assumes that a system with two generators and time series data has been serialized
+to a file.
+
+```python
+from infrasys import Component, System
+
+system = System.from_json("system.json")
+for component in system.get_components(Component):
+    for metadata in system.list_time_series_metadata(component):
+        print(f"{component.label}: {metadata.label} {metadata.user_attributes}")
+
+# Example output:
+# Generator.gen1: SingleTimeSeries.active_power {'scenario': 'high', 'model_year': '2030'}
+# Generator.gen1: SingleTimeSeries.active_power {'scenario': 'high', 'model_year': '2035'}
+# Generator.gen1: SingleTimeSeries.active_power {'scenario': 'low', 'model_year': '2030'}
+# Generator.gen1: SingleTimeSeries.active_power {'scenario': 'low', 'model_year': '2035'}
+# Generator.gen1: SingleTimeSeries.reactive_power {'scenario': 'high', 'model_year': '2030'}
+# Generator.gen1: SingleTimeSeries.reactive_power {'scenario': 'high', 'model_year': '2035'}
+# Generator.gen1: SingleTimeSeries.reactive_power {'scenario': 'low', 'model_year': '2030'}
+# Generator.gen1: SingleTimeSeries.reactive_power {'scenario': 'low', 'model_year': '2035'}
+# Generator.gen2: SingleTimeSeries.active_power {'scenario': 'high', 'model_year': '2030'}
+# Generator.gen2: SingleTimeSeries.active_power {'scenario': 'high', 'model_year': '2035'}
+# Generator.gen2: SingleTimeSeries.active_power {'scenario': 'low', 'model_year': '2030'}
+# Generator.gen2: SingleTimeSeries.active_power {'scenario': 'low', 'model_year': '2035'}
+# Generator.gen2: SingleTimeSeries.reactive_power {'scenario': 'high', 'model_year': '2030'}
+# Generator.gen2: SingleTimeSeries.reactive_power {'scenario': 'high', 'model_year': '2035'}
+# Generator.gen2: SingleTimeSeries.reactive_power {'scenario': 'low', 'model_year': '2030'}
+# Generator.gen2: SingleTimeSeries.reactive_power {'scenario': 'low', 'model_year': '2035'}
+```
+
+Now you can retrieve the exact instance you want.
+
+```python
+system.time_series.get(gen1, variable_name="active_power", scenario="high", model_year="2035").data
+<pyarrow.lib.Int64Array object at 0x107a38d60>
+[
+  0,
+  1,
+  2,
+  3,
+  4,
+  5,
+  6,
+  7,
+  8,
+  9,
+  ...
+  8774,
+  8775,
+  8776,
+  8777,
+  8778,
+  8779,
+  8780,
+  8781,
+  8782,
+  8783
+]
+```
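The idea that a variable name plus user attributes pins down exactly one array can be illustrated with plain dictionaries. This is a sketch only: the records below are hypothetical stand-ins shaped like the listing above, and `select` is not an `infrasys` function.

```python
from itertools import product

# Hypothetical metadata records for one generator, mirroring the listing above.
records = [
    {
        "variable_name": variable,
        "user_attributes": {"scenario": scenario, "model_year": year},
    }
    for variable, scenario, year in product(
        ["active_power", "reactive_power"], ["high", "low"], ["2030", "2035"]
    )
]


def select(records, variable_name, **attributes):
    """Hypothetical filter: keep records whose name and attributes all match."""
    return [
        r
        for r in records
        if r["variable_name"] == variable_name
        and all(r["user_attributes"].get(k) == v for k, v in attributes.items())
    ]


exact = select(records, "active_power", scenario="high", model_year="2035")
partial = select(records, "active_power", scenario="high")
```

Supplying every attribute narrows the result to a single record, while omitting `model_year` leaves two candidates, which is why listing the metadata first is useful.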