
Commit

add more explicit indication about Item-only vs shared Item/Asset MLM fields
fmigneault committed Nov 6, 2024
1 parent 596e5d4 commit 2ff381d
Showing 1 changed file with 35 additions and 28 deletions.
63 changes: 35 additions & 28 deletions README.md
@@ -116,34 +116,41 @@ The fields in the table below can be used in these parts of STAC documents:

[item-assets]: https://github.com/stac-extensions/item-assets

| Field Name | Type | Description |
|-----------------------------|---------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| mlm:name | string | **REQUIRED** A name for the model. This can include, but must be distinct, from simply naming the model architecture. If there is a publication or other published work related to the model, use the official name of the model. |
| mlm:architecture | [Model Architecture](#model-architecture) string | **REQUIRED** A generic and well established architecture name of the model. |
| mlm:tasks | \[[Task Enum](#task-enum)] | **REQUIRED** Specifies the Machine Learning tasks for which the model can be used for. If multi-tasks outputs are provided by distinct model heads, specify all available tasks under the main properties and specify respective tasks in each [Model Output Object](#model-output-object). |
| mlm:framework | string | Framework used to train the model (ex: PyTorch, TensorFlow). |
| mlm:framework_version | string | The `framework` library version. Some models require a specific version of the machine learning `framework` to run. |
| mlm:memory_size | integer | The in-memory size of the model on the accelerator during inference (bytes). |
| mlm:total_parameters | integer | Total number of model parameters, including trainable and non-trainable parameters. |
| mlm:pretrained | boolean | Indicates if the model was pretrained. If the model was pretrained, consider providing `pretrained_source` if it is known. |
| mlm:pretrained_source | string \| null | The source of the pretraining. Can refer to popular pretraining datasets by name (i.e. Imagenet) or less known datasets by URL and description. If trained from scratch (i.e.: `pretrained = false`), the `null` value should be set explicitly. |
| mlm:batch_size_suggestion | integer | A suggested batch size for the accelerator and summarized hardware. |
| mlm:accelerator | [Accelerator Type Enum](#accelerator-type-enum) \| null | The intended computational hardware that runs inference. If undefined or set to `null` explicitly, the model does not require any specific accelerator. |
| mlm:accelerator_constrained | boolean | Indicates if the intended `accelerator` is the only `accelerator` that can run inference. If undefined, it should be assumed `false`. |
| mlm:accelerator_summary | string | A high level description of the `accelerator`, such as its specific generation, or other relevant inference details. |
| mlm:accelerator_count | integer | A minimum amount of `accelerator` instances required to run the model. |
| mlm:input | \[[Model Input Object](#model-input-object)] | **REQUIRED** Describes the transformation between the EO data and the model input. |
| mlm:output | \[[Model Output Object](#model-output-object)] | **REQUIRED** Describes each model output and how to interpret it. |
| mlm:hyperparameters | [Model Hyperparameters Object](#model-hyperparameters-object) | Additional hyperparameters relevant for the model. |

To decide whether above fields should be applied under Item `properties` or under respective Assets, the context of
each field must be considered. For example, the `mlm:name` should always be provided in the Item `properties`, since
it relates to the model as a whole. In contrast, some models could support multiple `mlm:accelerator`, which could be
handled by distinct source code represented by different Assets. In such case, `mlm:accelerator` definitions should be
nested under their relevant Asset. If a field is defined both at the Item and Asset level, the value at the Asset level
would be considered for that specific Asset, and the value at the Item level would be used for other Assets that did
not override it for their respective reference. For some of the fields, further details are provided in following
sections to provide more precisions regarding some potentially ambiguous use cases.
| Field Name | Type | Description |
|-----------------------------------------|---------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| mlm:name <sup>[[1]][1]</sup> | string | **REQUIRED** A name for the model. This can include, but must be distinct, from simply naming the model architecture. If there is a publication or other published work related to the model, use the official name of the model. |
| mlm:architecture | [Model Architecture](#model-architecture) string | **REQUIRED** A generic and well established architecture name of the model. |
| mlm:tasks | \[[Task Enum](#task-enum)] | **REQUIRED** Specifies the Machine Learning tasks for which the model can be used for. If multi-tasks outputs are provided by distinct model heads, specify all available tasks under the main properties and specify respective tasks in each [Model Output Object](#model-output-object). |
| mlm:framework | string | Framework used to train the model (ex: PyTorch, TensorFlow). |
| mlm:framework_version | string | The `framework` library version. Some models require a specific version of the machine learning `framework` to run. |
| mlm:memory_size | integer | The in-memory size of the model on the accelerator during inference (bytes). |
| mlm:total_parameters | integer | Total number of model parameters, including trainable and non-trainable parameters. |
| mlm:pretrained | boolean | Indicates if the model was pretrained. If the model was pretrained, consider providing `pretrained_source` if it is known. |
| mlm:pretrained_source | string \| null | The source of the pretraining. Can refer to popular pretraining datasets by name (i.e. Imagenet) or less known datasets by URL and description. If trained from scratch (i.e.: `pretrained = false`), the `null` value should be set explicitly. |
| mlm:batch_size_suggestion | integer | A suggested batch size for the accelerator and summarized hardware. |
| mlm:accelerator | [Accelerator Type Enum](#accelerator-type-enum) \| null | The intended computational hardware that runs inference. If undefined or set to `null` explicitly, the model does not require any specific accelerator. |
| mlm:accelerator_constrained | boolean | Indicates if the intended `accelerator` is the only `accelerator` that can run inference. If undefined, it should be assumed `false`. |
| mlm:accelerator_summary | string | A high level description of the `accelerator`, such as its specific generation, or other relevant inference details. |
| mlm:accelerator_count | integer | A minimum amount of `accelerator` instances required to run the model. |
| mlm:input <sup>[[1]][1]</sup> | \[[Model Input Object](#model-input-object)] | **REQUIRED** Describes the transformation between the EO data and the model input. |
| mlm:output <sup>[[1]][1]</sup> | \[[Model Output Object](#model-output-object)] | **REQUIRED** Describes each model output and how to interpret it. |
| mlm:hyperparameters <sup>[[1]][1]</sup> | [Model Hyperparameters Object](#model-hyperparameters-object) | Additional hyperparameters relevant for the model. |

[1]: #sup1sup-allowed-only-in-item-properties

##### <sup>[1]</sup> Allowed Only in Item `properties`

> [!NOTE]
> Unless stated otherwise by <sup>[[1]][1]</sup> in the table, fields can be used at either the Item or Asset level.
> <br><br>
> To decide whether above fields should be applied under Item `properties` or under respective Assets, the context of
> each field must be considered. For example, the `mlm:name` should always be provided in the Item `properties`, since
> it relates to the model as a whole. In contrast, some models could support multiple `mlm:accelerator`, which could be
> handled by distinct source code represented by different Assets. In such case, `mlm:accelerator` definitions should be
> nested under their relevant Asset. If a field is defined both at the Item and Asset level, the value at the Asset
> level would be considered for that specific Asset, and the value at the Item level would be used for other Assets that
> did not override it for their respective reference. For some of the fields, further details are provided in following
> sections to provide more precisions regarding some potentially ambiguous use cases.
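
As a rough illustration of the precedence described in the note above, the following Python sketch resolves the value that applies to a given Asset. The helper name `resolve_mlm_field`, the `ITEM_ONLY_FIELDS` set, and the sample Item (asset keys and field values) are assumptions made for this example. They are not part of the extension or of any library.

```python
from typing import Any

# Fields marked [1] in the table above: allowed only in Item `properties`.
ITEM_ONLY_FIELDS = {"mlm:name", "mlm:input", "mlm:output", "mlm:hyperparameters"}


def resolve_mlm_field(item: dict[str, Any], asset_key: str, field: str) -> Any:
    """Return the value of `field` that applies to one Asset (hypothetical helper).

    An Asset-level value overrides the Item-level one for that Asset only.
    Item-only fields are always read from the Item `properties`.
    """
    properties = item.get("properties", {})
    asset = item.get("assets", {}).get(asset_key, {})
    # Key presence is checked (not truthiness) so an explicit `null`
    # at the Asset level still overrides the Item-level value.
    if field not in ITEM_ONLY_FIELDS and field in asset:
        return asset[field]
    return properties.get(field)


# Illustrative Item: names and values are made up for this sketch.
item = {
    "properties": {
        "mlm:name": "example-model",
        "mlm:accelerator": "cuda",
    },
    "assets": {
        "model-gpu": {"href": "https://example.com/model-gpu.pt"},
        "model-cpu": {
            "href": "https://example.com/model-cpu.pt",
            "mlm:accelerator": "amd64",
        },
    },
}

gpu_accelerator = resolve_mlm_field(item, "model-gpu", "mlm:accelerator")
cpu_accelerator = resolve_mlm_field(item, "model-cpu", "mlm:accelerator")
cpu_model_name = resolve_mlm_field(item, "model-cpu", "mlm:name")
```

In this sketch, `model-cpu` uses its own `mlm:accelerator`, `model-gpu` falls back to the Item-level value, and `mlm:name` is read from the Item `properties` for every Asset.
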
In addition, fields from the multiple relevant extensions should be defined as applicable. See
[Best Practices - Recommended Extensions to Compose with the ML Model Extension](best-practices.md#recommended-extensions-to-compose-with-the-ml-model-extension)