Required Calculation Models

Overview

The RequiredCalcModels class is a subclass of CalculationRequirement that checks whether a set of calculation models is present for each object. This is useful for features built on calculation models, such as those based on ML models.

Usage

This requirement is instantiated with a dictionary that maps each object name to the list of calculation models that must be present for it. For example:

requirement = RequiredCalcModels(calc_models={"SDM1-VRN1-01": [{"model_name": "fitted_power_curve", "model_type": "fitted_power_curve"}]})

After the check and get_data methods are called, the data attribute of the requirement holds a dictionary keyed by object name. Each value is a nested dictionary keyed by model name, which in turn maps each file name of the model to its content. For example:

{
    "SDM1-VRN1-01": {
        "fitted_power_curve": {
            "model": <python object>
        }
    }
}
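
As a rough illustration of how this nested structure can be consumed, the sketch below walks a stand-in dictionary shaped like the one above. The `iter_model_files` helper and the placeholder `object()` value are hypothetical and are not part of echo_energycalc; in real use the dictionary would come from the `data` attribute of the requirement.

```python
from typing import Any

# Stand-in for the structure described for `requirement.data`:
# {object_name: {calculation_model_name: {file_name: value}}}
data: dict[str, dict[str, dict[str, Any]]] = {
    "SDM1-VRN1-01": {
        "fitted_power_curve": {
            "model": object(),  # placeholder for the deserialized model object
        },
    },
}


def iter_model_files(data):
    """Yield (object_name, model_name, file_name, value) for every stored file."""
    for object_name, models in data.items():
        for model_name, files in models.items():
            for file_name, value in files.items():
                yield object_name, model_name, file_name, value


# Flat listing of everything that was loaded
rows = list(iter_model_files(data))

# Direct access to a single file of a single model
model = data["SDM1-VRN1-01"]["fitted_power_curve"]["model"]
```

Direct indexing is enough when the object, model, and file names are known in advance; the generator is only convenient when looping over several objects.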

Database Requirements

This requirement expects that the four tables below are correctly set:

  • calculation_models: Definition of the model.
  • calculation_model_files_def: Definition of the files for the model.
  • calculation_model_files_data: Actual binary files for the model.
  • calculation_model_files_data_object_connections: Connection between the object and the model files.

At the end, the v_calculation_model_files_data view will be used to get the files for the model.

Class Definition

RequiredCalcModels(calc_models, optional=False)

Subclass of CalculationRequirement that defines the calculation models that are required for the calculation.

This will check the performance database for the existence of the required calculation models for the wanted objects.

Parameters:

  • calc_models

    (dict[str | None, list[dict[str, str | None]]]) –

    Calculation models that are required for the calculation. This should be in the format below:

    {
        object_name: [
            {
                "model_name": "calculation_model_name",
                "model_type": "calculation_model_type"
            },
            ...
        ],
        ...
    }
    

    Where:

    • object_name: str | None Name of the object for which the calculation model is required. If None, we assume the calculation model is not connected to any object.
    • model_name: str Name of the calculation model as in performance_db. It will be treated as a regex to filter the calculation models.
    • model_type: str | None Type of the calculation model as in performance_db. If None, the calculation models are not filtered by type.
  • optional

    (bool, default: False ) –

    Set to True if this is an optional requirement. Defaults to False.
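
The constructor validates calc_models with a JSON schema (see the source below). The sketch that follows restates the same rules using only the standard library, which can be handy for checking a configuration before handing it to the class. The `validate_calc_models` function is hypothetical and not part of echo_energycalc, and the check on the dictionary keys is an assumption taken from the type hint, since the schema itself does not constrain them.

```python
def validate_calc_models(calc_models) -> None:
    """Raise ValueError if calc_models does not follow the documented format."""
    if not isinstance(calc_models, dict):
        raise ValueError("calc_models must be a dict")

    for object_name, models in calc_models.items():
        # Assumption from the type hint: keys are object names or None
        if object_name is not None and not isinstance(object_name, str):
            raise ValueError(f"Invalid object name: {object_name!r}")

        # Each object needs a non-empty list of model definitions ("minItems": 1)
        if not isinstance(models, list) or len(models) < 1:
            raise ValueError(f"Expected a non-empty list of models for {object_name!r}")

        for model in models:
            # Both keys are required and no other keys are allowed
            if not isinstance(model, dict) or set(model) != {"model_name", "model_type"}:
                raise ValueError(f"Invalid model definition: {model!r}")

            name, model_type = model["model_name"], model["model_type"]
            for value in (name, model_type):
                if value is not None and not isinstance(value, str):
                    raise ValueError(f"Invalid model definition: {model!r}")

            # The two "anyOf" branches together require at least one string
            if name is None and model_type is None:
                raise ValueError("model_name and model_type cannot both be None")


# Passes silently for the example used in the Usage section
validate_calc_models(
    {"SDM1-VRN1-01": [{"model_name": "fitted_power_curve", "model_type": "fitted_power_curve"}]},
)
```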

Source code in echo_energycalc/calculation_requirement_calc_models.py
def __init__(self, calc_models: dict[str | None, list[dict[str, str | None]]], optional: bool = False) -> None:
    """
    Constructor of the RequiredCalcModels class.

    This will check the performance database for the existence of the required calculation models for the wanted objects.

    Parameters
    ----------
    calc_models : dict[str | None, list[dict[str, str | None]]]
        Calculation models that are required for the calculation. This should be in the format below:

        ```python
        {
            object_name: [
                {
                    "model_name": "calculation_model_name",
                    "model_type": "calculation_model_type"
                },
                ...
            ],
            ...
        }
        ```

        Where:

        - object_name: str | None
            Name of the object for which the calculation model is required. If None, we assume the calculation model is not connected to any object.
        - model_name: str
            Name of the calculation model as in performance_db. It will be treated as a regex to filter the calculation models.
        - model_type: str | None
            Type of the calculation model as in performance_db. If None, the calculation models are not filtered by type.
    optional : bool, optional
        Set to True if this is an optional requirement. Defaults to False.
    """
    super().__init__(optional=optional)

    calc_models_schema = {
        "description": "Keys must be the name of the objects.",
        "type": "object",
        "additionalProperties": {
            "type": "array",
            "items": {
                "anyOf": [
                    {
                        "type": "object",
                        "properties": {
                            "model_name": {
                                "type": ["string"],
                                "description": "Name of the calculation model as in performance_db. It will be treated as a regex to filter the calculation models.",
                            },
                            "model_type": {
                                "type": ["string", "null"],
                                "description": "Type of the calculation model as in performance_db",
                            },
                        },
                        "required": ["model_name", "model_type"],
                        "additionalProperties": False,
                    },
                    {
                        "type": "object",
                        "properties": {
                            "model_name": {
                                "type": ["string", "null"],
                                "description": "Name of the calculation model as in performance_db. It will be treated as a regex to filter the calculation models.",
                            },
                            "model_type": {
                                "type": ["string"],
                                "description": "Type of the calculation model as in performance_db",
                            },
                        },
                        "required": ["model_name", "model_type"],
                        "additionalProperties": False,
                    },
                ],
            },
            "minItems": 1,
        },
    }

    try:
        jsonschema.validate(calc_models, calc_models_schema)
    except jsonschema.ValidationError as e:
        raise ValueError("Invalid calc_models argument") from e

    self._calc_models: dict[str | None, list[dict[str, str | None]]] = calc_models

    # temporary directory used to store the calculation models
    self._temp_dir = tempfile.mkdtemp()

calc_models property

Calculation models that are required for the calculation.

Returns:

  • dict[str | None, list[dict[str, str | None]]]

    Calculation models that are required for the calculation.

checked property

Attribute that defines if the requirement has been checked. Its value starts as False and is set to True after the check method is called.

Returns:

  • bool

    True if the requirement has been checked.

data property

Data required for the calculation.

Returns:

  • dict[str, dict[str, dict[str, Any]]]

    dict in the format {object_name: {calculation_model_name: {file_name: value}}}

optional property

Attribute that defines if the requirement is optional.

If optional is True, the requirement is only validated to check if it could exist, not if it is actually present. This is useful for requirements that are not necessary for all calculations, but are useful for some of them.

Returns:

  • bool

    True if the requirement is optional.

check()

Method used to check if all required calculation models are present in the database for each object.

If the requirement is not met and the calculation cannot be performed, an error is raised.

Returns:

  • bool

    Returns True if the requirement is met.
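
The two conditions that check enforces per model can be sketched without a database: the model_name pattern must resolve to exactly one calculation model, and every file defined for that model must exist for the object. The helpers below are hypothetical, and matching with `re.search` is an assumption made for illustration; the actual regex matching is performed by performance_db.

```python
import re


def resolve_single_model(pattern, known_models):
    """Return the only model name matching pattern, or raise ValueError."""
    matches = sorted({name for name in known_models if re.search(pattern, name)})
    if not matches:
        raise ValueError(f"No calculation model found for {pattern=}")
    if len(matches) > 1:
        raise ValueError(f"More than one calculation model found for {pattern=}. Got {matches}")
    return matches[0]


def find_missing_files(defined_files, stored_files):
    """Return the file names that are defined for the model but not stored."""
    return set(defined_files) - set(stored_files)


# Hypothetical model names, used only to show the regex pitfall
known = ["fitted_power_curve", "fitted_power_curve_v2"]

# An unanchored pattern matches both names and would be rejected as ambiguous;
# anchoring it selects a single model.
selected = resolve_single_model("^fitted_power_curve$", known)

# "scaler" is defined for the model but no file was stored for it
missing = find_missing_files(["model", "scaler"], {"model": b"..."})
```

Because model names are treated as regular expressions, anchoring the pattern with `^` and `$` is a simple way to avoid the "more than one calculation model found" error when one model name is a prefix of another.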

Source code in echo_energycalc/calculation_requirement_calc_models.py
def check(self) -> bool:
    """
    Method used to check if all required calculation models are present in the database for each object.

    If the requirement is not met and the calculation cannot be performed, an error is raised.

    Returns
    -------
    bool
        Returns True if the requirement is met.
    """
    if self.optional:
        return True

    # iterating each object and checking if all calculation models are present
    for object_name, calc_models in self.calc_models.items():
        # iterating each calculation model
        for calc_model in calc_models:
            # getting the required calculation model files
            calc_model_files_def = self._perfdb.calcmodels.instances.files.definitions.get(
                calcmodels=[calc_model["model_name"]] if calc_model["model_name"] is not None else None,
                calcmodel_types=[calc_model["model_type"]] if calc_model["model_type"] is not None else None,
                model_as_regex=True,
                output_type="DataFrame",
            )
            # checking that exactly one calculation model was returned for the wanted filter
            if calc_model_files_def.empty:
                raise ValueError(f"No calculation model found for {calc_model=} of {object_name=}")
            if len(calc_model_files_def.index.get_level_values("calculation_model_name").unique()) > 1:
                raise ValueError(
                    f"More than one calculation model found for {calc_model=} of {object_name=}. Got {calc_model_files_def.index.get_level_values('calculation_model_name').unique().tolist()}. Keep in mind that the arguments are treated as regex",
                )

            calc_model_name = calc_model_files_def.index.get_level_values("calculation_model_name").unique()[0]

            # getting the calculation model files to check if all of them are present
            calc_model_files = self._perfdb.calcmodels.instances.files.values.get_ids(
                object_names=[object_name],
                calcmodels=[calc_model_name],
                calcmodel_types=[calc_model["model_type"]] if calc_model["model_type"] is not None else None,
                model_as_regex=True,
            )

            if len(calc_model_files) == 0:
                raise ValueError(f"No calculation model files found for {calc_model_name=} of {object_name=}")
            if missing_files := set(
                calc_model_files_def.index.get_level_values("file_name").tolist(),
            ) - set(calc_model_files[object_name][calc_model_name].keys()):
                raise ValueError(f"Missing files for {calc_model_name=} of {object_name=}: {missing_files}")

    self._checked = True

    return True

get_data(**kwargs)

Method used to get the data required for the calculation.

This will download all the files of all the required calculation models and return a dict with the model. This dict will also be available in the object property "data".

If the model does not have an associated object the key used for it will be "general".

Returns:

  • dict[str, dict[str, dict[str, Any]]]

    dict in the format {object_name: {calculation_model_name: {file_name: value}}}
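
The reshaping that get_data applies to the downloaded files can be sketched in isolation. The input below is a stand-in for what the database layer returns for one object: each file entry is a dictionary that contains at least a "value" key. The "id" key, the second model, and the `keep_values_only` helper are hypothetical and only serve the illustration.

```python
def keep_values_only(files_by_model):
    """Reduce {model: {file_name: {"value": ...}}} to {model: {file_name: value}}."""
    return {
        model: {file_name: file_vals["value"] for file_name, file_vals in model_files.items()}
        for model, model_files in files_by_model.items()
    }


# Stand-ins for two separate lookups made for the same object
raw_first = {"fitted_power_curve": {"model": {"value": "<python object>", "id": 1}}}
raw_second = {"other_model": {"weights": {"value": [0.1, 0.2], "id": 2}}}

# Results for each requested model are merged into the entry of the object
result = {"SDM1-VRN1-01": {}}
result["SDM1-VRN1-01"] |= keep_values_only(raw_first)
result["SDM1-VRN1-01"] |= keep_values_only(raw_second)
```

Note that merging with `|=` lets a later lookup overwrite an earlier one if both return the same model name.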

Source code in echo_energycalc/calculation_requirement_calc_models.py
def get_data(self, **kwargs) -> dict[str, dict[str, dict[str, Any]]]:  # noqa: ARG002
    """
    Method used to get the data required for the calculation.

    This will download all the files of all the required calculation models and return a dict with the model.
    This dict will also be available in the object property "data".

    If the model does not have an associated object the key used for it will be "general".

    Returns
    -------
    dict[str, dict[str, dict[str, Any]]]
        dict in the format {object_name: {calculation_model_name: {file_name: value}}}
    """
    # check if requirement has been checked
    if not self._checked:
        self.check()

    # dict to store all calc models where first key is the object_name, second is the calculation_model_name, third is the file_name and the final value is the value
    # this dict will be in the format {object_name: {calculation_model_name: {file_name: value}}}
    calc_model_files = {}

    # getting all files of the calculation models
    for object_name, calc_models in self.calc_models.items():
        if object_name not in calc_model_files:
            calc_model_files[object_name] = {}
        for calc_model in calc_models:
            # getting files for calculation model
            try:
                this_calc_model_files = self._perfdb.calcmodels.instances.files.values.get(
                    object_names=[object_name],
                    calcmodels=[calc_model["model_name"]] if calc_model["model_name"] is not None else None,
                    calcmodel_types=[calc_model["model_type"]] if calc_model["model_type"] is not None else None,
                    model_as_regex=True,
                    output_type="dict",
                )
                this_calc_model_files = this_calc_model_files[object_name]

                # only getting the value for each file
                this_calc_model_files = {
                    model: {file_name: file_vals["value"] for file_name, file_vals in model_files.items()}
                    for model, model_files in this_calc_model_files.items()
                }

                calc_model_files[object_name] |= this_calc_model_files
            except Exception as e:
                if self.optional:
                    logger.exception(f"Failed to get data for optional calculation model {calc_model=}, {object_name=}.")
                else:
                    raise

    # finally, we store the dict in the object property "data"
    self._data = copy.deepcopy(calc_model_files)

    return self.data