Solar Module Temperature Loss

Overview

The SolarEnergyLossModuleTemperature class is a subclass of SolarEnergyLossCalculator that calculates the energy loss due to module temperature effects using a pre-trained predictive model. The model is loaded from the database and takes as input the average module temperature from the associated weather stations. The model is applied to hourly data: its output (a temperature loss factor) is multiplied by the ActivePowerTheoretical of the SPE at each hourly timestamp, and the hourly losses are then aggregated to daily values.
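As the constructor source further down this page shows, the model is stored as a base64-encoded string and deserialized with joblib. The round trip can be sketched with the standard library alone. Here `base64` and `pickle` stand in for `pybase64` and `joblib`, and `toy_model` is a hypothetical dictionary of coefficients, not a real fitted estimator.

```python
import base64
import pickle
from io import BytesIO

# Hypothetical stand-in for a fitted model: just a dict of coefficients
toy_model = {"slope": 0.004, "reference_temp": 25.0}

# What would be stored in the database: the serialized model as base64 text
stored = base64.b64encode(pickle.dumps(toy_model)).decode("ascii")

# What happens on load: decode the text, wrap the bytes in a buffer, deserialize
with BytesIO(base64.b64decode(stored)) as buffer:
    loaded = pickle.load(buffer)

print(loaded == toy_model)
```

As with any pickle-based format, this kind of payload should only be deserialized when it comes from a trusted source.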

Calculation Logic

The calculation works as follows:

  1. Retrieve module temperature data (ModuleTempCommOk_5min.AVG) from all simple weather stations associated with the object. Also retrieve theoretical power data (ActivePowerTheoretical_10min.AVG) from the SPE object.
  2. Average the module temperature across all available weather stations for each timestamp, replacing missing averages with 25°C.
  3. Resample the data to hourly frequency.
  4. Keep only complete days (days with 24 hourly values).
  5. Use the pre-trained model to predict the temperature loss factor/percentage for each hour of the complete days.
  6. Multiply the temperature loss factor by the ActivePowerTheoretical on each hourly timestamp.
  7. Clip negative values to zero (no gain is considered) and set the loss to zero while the sun is below the horizon.
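  8. Aggregate the hourly losses to daily values. Note that the calculate source takes the daily mean of the hourly losses (average kW over the day), which for a complete day equals the daily kWh total divided by 24.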
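
The steps above can be sketched with pandas as follows. This is a minimal illustration and not the library's implementation: `toy_loss_factor` is a hypothetical stand-in for the pre-trained model, the column names `TArray` and `P_theo` are chosen for the example, the input is assumed to be already averaged across weather stations, and the night-time zeroing is left out. The daily aggregation uses the mean, mirroring the calculate source.

```python
import pandas as pd


def toy_loss_factor(t_array: pd.Series) -> pd.Series:
    """Hypothetical stand-in for the model: 0.4 % loss per degree above 25 °C."""
    return 0.004 * (t_array - 25.0)


def daily_temperature_loss(df: pd.DataFrame) -> pd.Series:
    """df has a DatetimeIndex and the columns 'TArray' (°C) and 'P_theo' (kW)."""
    df = df.copy()
    # Missing temperatures are replaced by the 25 °C reference value
    df["TArray"] = df["TArray"].fillna(25.0)
    # Hourly means, then keep only days that have all 24 hourly rows
    hourly = df.resample("h").mean()
    rows_per_day = hourly.resample("D").size()
    complete_days = rows_per_day[rows_per_day == 24].index
    hourly = hourly[hourly.index.normalize().isin(complete_days)].copy()
    # Loss factor times theoretical power; gains (negative losses) are clipped to zero
    loss = (toy_loss_factor(hourly["TArray"]) * hourly["P_theo"]).clip(lower=0)
    # Daily mean of the hourly losses
    return loss.resample("D").mean()


# 36 hours of 5-minute data: the first day is complete, the second is not
idx = pd.date_range("2024-01-01", periods=36 * 12, freq="5min")
df = pd.DataFrame({"TArray": 45.0, "P_theo": 1000.0}, index=idx)
daily = daily_temperature_loss(df)
print(daily)
```

In this toy input only 2024-01-01 survives the complete-day filter, so the result has a single row.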

Database Requirements

  • Feature attribute feature_options_json with the following keys:
    • calc_model_type: Type of the calculation model to use (e.g., 'modtemperature_fit').
    • model_name: Name of the model to use for the calculation.
    • bazefield_features: Boolean indicating if the features are sourced from Bazefield.
  • The following object attributes for the object being calculated:
    • reference_weather_stations: Dictionary indicating which simple weather stations to use (e.g., { "simple_ws": "RBG-RBG2-MET1" }).
  • The calculation model must be available in the database and match the type and name specified in feature_options_json.
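
The constructor source further down builds two patterns from feature_options_json: the model name is wrapped as `.*<model_name>.*` and the model type as `^<calc_model_type>$`. The sketch below shows what those patterns accept, assuming they are applied as ordinary regular expressions. The candidate list and the model name "RBG" are made-up values for illustration.

```python
import re

# Hypothetical option values; 'modtemperature_fit' is the example type used above
options = {"calc_model_type": "modtemperature_fit", "model_name": "RBG"}

name_pattern = re.compile(f".*{options['model_name']}.*")
type_pattern = re.compile(f"^{options['calc_model_type']}$")

# Hypothetical (name, type) pairs standing in for models stored in the database
candidates = [
    ("modtemp_RBG_v2", "modtemperature_fit"),
    ("modtemp_RBG_v2", "modtemperature_fit_old"),
    ("modtemp_XYZ", "modtemperature_fit"),
]

matches = [
    (name, mtype)
    for name, mtype in candidates
    if name_pattern.search(name) and type_pattern.search(mtype)
]
print(matches)
```

The name only needs to contain model_name, while the type must match exactly. If several models match, the constructor uses the first one returned, so a specific model_name is the safer choice.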

Class Definition

SolarEnergyLossModuleTemperature(object_name, feature)

Calculator for solar energy loss due to module temperature, based on a predictive model.

For this class to work, the feature must have the attribute feature_options_json with the following keys:

  • 'calc_model_type': type of the model that will be used to calculate the feature. It must match the type of the model in performance_db.
  • 'model_name': name of the model that will be used to calculate the feature.
  • 'bazefield_features': bool indicating if the required features need to be acquired from bazefield.
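
A check of these keys could look like the sketch below. `check_feature_options` is a hypothetical helper written for this page, not the library's `_validate_feature_options`. It treats 'bazefield_features' as optional because the constructor reads it with a default of False.

```python
from typing import Any

REQUIRED_KEYS = {"calc_model_type": str, "model_name": str}
OPTIONAL_KEYS = {"bazefield_features": bool}


def check_feature_options(options: dict[str, Any]) -> None:
    """Hypothetical validator: raise if a key is missing or has the wrong type."""
    for key, expected in REQUIRED_KEYS.items():
        if key not in options:
            raise KeyError(f"feature_options_json is missing the key '{key}'")
        if not isinstance(options[key], expected):
            raise TypeError(f"'{key}' must be {expected.__name__}")
    for key, expected in OPTIONAL_KEYS.items():
        if key in options and not isinstance(options[key], expected):
            raise TypeError(f"'{key}' must be {expected.__name__}")


# A configuration that passes the check (values are illustrative)
check_feature_options(
    {"calc_model_type": "modtemperature_fit", "model_name": "RBG", "bazefield_features": True},
)
```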

Parameters:

  • object_name

    (str) –

    Name of the object for which the feature is calculated. It must exist in performance_db.

  • feature

    (str) –

    Feature of the object that is calculated. It must exist in performance_db.

Source code in echo_energycalc/solar_energy_loss_mod_temperature.py
def __init__(self, object_name: str, feature: str) -> None:
    """
    Class used to calculate features that depend on a PredictiveModel.

    For this class to work, the feature must have the attribute `feature_options_json` with the following keys:

    - 'calc_model_type': type of the model that will be used to calculate the feature. It must match the type of the model in performance_db.
    - 'model_name': name of the model that will be used to calculate the feature.
    - 'bazefield_features': bool indicating if the required features need to be acquired from bazefield.

    Parameters
    ----------
    object_name : str
        Name of the object for which the feature is calculated. It must exist in performance_db.
    feature : str
        Feature of the object that is calculated. It must exist in performance_db.
    """
    # initialize parent class
    super().__init__(object_name, feature)

    self._add_requirement(RequiredFeatureAttributes(self.object, self.feature, ["feature_options_json"]))

    self._get_required_data()

    self._feature_attributes = self._get_requirement_data("RequiredFeatureAttributes")[self.feature]

    self._validate_feature_options()

    self._add_requirement(
        RequiredCalcModels(
            calc_models={
                self.object: [
                    {
                        "model_name": f".*{self._feature_attributes['feature_options_json']['model_name']}.*",
                        "model_type": f"^{self._feature_attributes['feature_options_json']['calc_model_type']}$",
                    },
                ],
            },
        ),
    )
    self._add_requirement(
        RequiredObjectAttributes(
            {
                self.object: [
                    "reference_weather_stations",
                    "latitude",
                    "longitude",
                ],
            },
        ),
    )
    self._get_required_data()

    # getting the model name
    self._model_name = next(iter(self._get_requirement_data("RequiredCalcModels")[self.object].keys()))

    # loading calculation model from file
    self._model = self._get_requirement_data("RequiredCalcModels")[self.object][self._model_name]["model"]

    # Deserializing the model from base64
    if self._model is None:
        raise ValueError(
            f"Model {self._model_name} not found for object {self.object}. Please check the configuration in the database.",
        )
    model_b64_loaded = self._model["model"]
    with BytesIO(pybase64.b64decode(model_b64_loaded)) as buffer:
        buffer.seek(0)
        self._model = joblib.load(buffer)

    # defining required features
    simple_ws = self._get_requirement_data("RequiredObjectAttributes")[self.object]["reference_weather_stations"]["simple_ws"]
    features = {ws: ["ModuleTempCommOk_5min.AVG"] for ws in simple_ws}
    features[self.object] = ["ActivePowerTheoretical_10min.AVG"]

    # Adding suffix _b# to features if bazefield_features is True
    if self._feature_attributes["feature_options_json"].get("bazefield_features", False):
        features = {obj: [f"{feat}_b#" for feat in feats] for obj, feats in features.items()}
    self._add_requirement(RequiredFeatures(features=features))
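
The last lines of the constructor build a mapping from object name to required features. A standalone sketch of that step is shown below. `build_required_features` and the object name "RBG-RBG2-SPE1" are made up for illustration, and `simple_ws` is assumed to be a list of station names.

```python
def build_required_features(
    obj: str,
    simple_ws: list[str],
    bazefield_features: bool,
) -> dict[str, list[str]]:
    """Map each weather station and the object itself to the features they must provide."""
    features = {ws: ["ModuleTempCommOk_5min.AVG"] for ws in simple_ws}
    features[obj] = ["ActivePowerTheoretical_10min.AVG"]
    # Features sourced from Bazefield carry the "_b#" suffix
    if bazefield_features:
        features = {o: [f"{feat}_b#" for feat in feats] for o, feats in features.items()}
    return features


features = build_required_features("RBG-RBG2-SPE1", ["RBG-RBG2-MET1"], bazefield_features=True)
print(features)
```

Because the comprehension iterates over `simple_ws`, a bare string would be split into single characters, so the list assumption matters here.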

feature property

Feature that is calculated. This will be defined in the constructor and cannot be changed.

Returns:

  • str

    Name of the feature that is calculated.

name property

Name of the feature calculator. It is defined in child classes of FeatureCalculator.

This must be equal to the "server_calc_type" attribute of the feature in performance_db.

Returns:

  • str

    Name of the feature calculator.

object property

Object for which the feature is calculated. This will be defined in the constructor and cannot be changed.

Returns:

  • str

    Object name for which the feature is calculated.

requirements property

List of requirements of the feature calculator. It is defined in child classes of FeatureCalculator.

Returns:

  • dict[str, list[CalculationRequirement]]

    Dict of requirements.

    The keys are the names of the classes of the requirements and the values are lists of requirements of that class.

    For example: {"RequiredFeatures": [RequiredFeatures(...), RequiredFeatures(...)], "RequiredObjects": [RequiredObjects(...)]}

result property

Result of the calculation. This is None until the method "calculate" is called.

Returns:

  • Series | DataFrame | None:

    Result of the calculation if the method "calculate" was called. None otherwise.

calculate(period, save_into=None, cached_data=None, **kwargs)

Method that will calculate the feature.

This code will do the following:

  1. Get module temperature data from the weather stations associated with the object.
  2. Average the module temperature data from all weather stations.
  3. Resample the data to hourly frequency, keeping only complete days (24 hours of data).
  4. Predict the temperature loss using the model and clip negative values, as there is no gain associated with module temperature. The model will return a loss percentage that will be multiplied by the theoretical power. The result is the final loss in kW.
  5. Resample data to daily frequency resulting in kWh/day values.

Parameters:

  • period

    (DateTimeRange) –

    Period for which the feature will be calculated.

  • save_into

    (Literal['all', 'performance_db'] | None, default: None ) –

    Argument that will be passed to the method "save". The options are:

      • "all": The feature will be saved in performance_db and bazefield.
      • "performance_db": The feature will be saved only in performance_db.
      • None: The feature will not be saved.

    By default None.

  • cached_data

    (DataFrame | None, default: None ) –

    DataFrame with features already queried/calculated. This avoids querying all the data again from performance_db, which makes chained calculations much more efficient. By default None.

  • **kwargs

    (dict, default: {} ) –

    Additional arguments that will be passed to the "save" method.

Returns:

  • Series

    Pandas Series with the calculated feature.

Source code in echo_energycalc/solar_energy_loss_mod_temperature.py
def calculate(
    self,
    period: DateTimeRange,
    save_into: Literal["all", "performance_db"] | None = None,
    cached_data: DataFrame | None = None,
    **kwargs,
) -> Series:
    """
    Method that will calculate the feature.

    This code will do the following:
    1. Get module temperature data from the weather stations associated with the object.
    2. Average the module temperature data from all weather stations.
    3. Resample the data to hourly frequency, keeping only complete days (24 hours of data).
    4. Predict the temperature loss using the model and clip negative values, as there is no gain associated with module temperature. The model will return a loss percentage that will be multiplied by the theoretical power. The result is the final loss in kW.
    5. Resample data to daily frequency resulting in kWh/day values.

    Parameters
    ----------
    period : DateTimeRange
        Period for which the feature will be calculated.
    save_into : Literal["all", "performance_db"] | None, optional
        Argument that will be passed to the method "save". The options are:
        - "all": The feature will be saved in performance_db and bazefield.
        - "performance_db": the feature will be saved only in performance_db.
        - None: The feature will not be saved.

        By default None.
    cached_data : DataFrame | None, optional
        DataFrame with features already queried/calculated. This is useful to avoid needing to query all the data again from performance_db, making chained calculations a lot more efficient.
        By default None
    **kwargs : dict, optional
        Additional arguments that will be passed to the "save" method.

    Returns
    -------
    Series
        Pandas Series with the calculated feature.
    """
    t0 = perf_counter()

    # getting feature values
    self._get_required_data(
        period=period,
        reindex=None,
        round_timestamps={"freq": timedelta(minutes=5), "tolerance": timedelta(minutes=2)},
        cached_data=cached_data,
    )

    # getting DataFrame with feature values
    df = self._get_requirement_data("RequiredFeatures")

    t1 = perf_counter()

    # --------------- Adjusting Dataframe structure
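    # Averaging module temperature across all weather stations
    # (columns are matched by prefix so this works with or without the "_b#" suffix)
    is_temp = df.columns.get_level_values(1).str.startswith("ModuleTempCommOk_5min.AVG")
    df[("AVG", "ModuleTempCommOk_5min.AVG")] = df.loc[:, is_temp].mean(axis=1)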
    df = df.loc[:, df.columns.get_level_values("object").isin(["AVG", self.object])]
    df.columns = df.columns.droplevel(0)
    # Remove the suffix _b# from the columns
    df.columns = df.columns.str.replace("_b#$", "", regex=True)
    # Adjusting temperature values to 25 if NaN
    df["ModuleTempCommOk_5min.AVG"] = df["ModuleTempCommOk_5min.AVG"].fillna(25)
    # Renaming columns to match the model input
    df = df.rename(
        columns={
            "ModuleTempCommOk_5min.AVG": "TArray",
        },
    )

    # ------------ Keeping only days with 24 hours of data
    # Resampling the dataframe to hour frequency
    df_hourly = df.resample("h").mean()
    # Getting only full days
    complete_days = df_hourly.resample("D").size().loc[lambda x: x == 24].index
    # copy so that columns can be added below without a SettingWithCopyWarning
    df_complete_days = df_hourly[df_hourly.index.normalize().isin(complete_days)].copy()

    # Logging discarded days due to incomplete data
    discarded_days = set(df.index.normalize()) - set(df_complete_days.index.normalize())
    if discarded_days:
        logger.warning(
            f"{self.object} - {self.feature} - {period}: Discarded days due to less than 24 hours of data: {', '.join(str(day.date()) for day in discarded_days)}",
        )

    t2 = perf_counter()

    # ------------- Applying model to predict temperature loss
    if not df_complete_days.empty:
        x = df_complete_days[["TArray"]]
        # Predicting feature values
        model_result = self._model.predict(x)
        df_complete_days["model_result"] = model_result
        df_complete_days["Temp_Loss_10min.AVG"] = (
            df_complete_days["ActivePowerTheoretical_10min.AVG"] * df_complete_days["model_result"]
        )
        result_hourly = df_complete_days["Temp_Loss_10min.AVG"]
        result_hourly = result_hourly.clip(lower=0)

    # ----------- Adjusting NaN values
    # -------------During night, loss is 0
    # Getting timestamps and converting to UTC
    timestamps = result_hourly.index
    # adding 3 hours to convert local time to UTC (assumes the site clock is at UTC-3)
    times_pd = timestamps + Timedelta(hours=3)
    solar_position = pvlib.solarposition.get_solarposition(
        time=times_pd,
        latitude=self._get_requirement_data("RequiredObjectAttributes")[self.object]["latitude"],
        longitude=self._get_requirement_data("RequiredObjectAttributes")[self.object]["longitude"],
    )
    # Get the sun's elevation (altitude)
    # Sun altitude < 0 means the sun is below the horizon (night)
    is_night = solar_position["elevation"] < 0
    # Reset index to match df timestamps (convert back from UTC to local time)
    is_night.index = timestamps
    # zeroing result_hourly during night
    result_hourly.loc[is_night] = 0
    # Forward filling remaining NaN values during daytime
    result_hourly = result_hourly.ffill()

    # Resampling to daily values. Units: average kW over the day ("kWmed")
    result_daily = result_hourly.resample("D").mean()

    t3 = perf_counter()

    # Final loss calculation
    result = result_daily

    # adding calculated feature to class result attribute
    self._result = result.copy()

    # saving results
    self.save(save_into=save_into, **kwargs)

    logger.debug(
        f"{self.object} - {self.feature} - {period}: Requirements during calc {t1 - t0:.2f}s - Data adjustments {t2 - t1:.2f}s - Model prediction {t3 - t2:.2f}s - Saving data {perf_counter() - t3:.2f}s",
    )

    return result
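
The night mask in calculate shifts the timestamps by three hours before computing the solar position, which treats the index as local time at a fixed UTC-3 offset. The sketch below uses only the standard library to show that this shift matches a proper timezone conversion under that assumption. `LOCAL_TZ` and `local_naive_to_utc` are names made up for this illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the site clock is fixed at UTC-3 with no daylight saving time
LOCAL_TZ = timezone(timedelta(hours=-3))


def local_naive_to_utc(ts: datetime) -> datetime:
    """Interpret a naive timestamp as UTC-3 local time and return the naive UTC equivalent."""
    return ts.replace(tzinfo=LOCAL_TZ).astimezone(timezone.utc).replace(tzinfo=None)


local = datetime(2024, 1, 1, 21, 0)
utc = local_naive_to_utc(local)
print(local, "->", utc)
```

For objects in another time zone, or one with daylight saving time, a fixed three-hour shift would misplace sunrise and sunset in the mask.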

save(save_into=None, **kwargs)

Method to save the calculated feature values in performance_db.

Parameters:

  • save_into

    (Literal['all', 'performance_db'] | None, default: None ) –

    Argument that will be passed to the method "save". The options are:

      • "all": The feature will be saved in performance_db and bazefield.
      • "performance_db": The feature will be saved only in performance_db.
      • None: The feature will not be saved.

    By default None.

  • **kwargs

    (dict, default: {} ) –

    Not being used at the moment. Here only for compatibility.

Source code in echo_energycalc/feature_calc_core.py
def save(
    self,
    save_into: Literal["all", "performance_db"] | None = None,
    **kwargs,  # noqa: ARG002
) -> None:
    """
    Method to save the calculated feature values in performance_db.

    Parameters
    ----------
    save_into : Literal["all", "performance_db"] | None, optional
        Argument that will be passed to the method "save". The options are:
        - "all": The feature will be saved in performance_db and bazefield.
        - "performance_db": the feature will be saved only in performance_db.
        - None: The feature will not be saved.

        By default None.
    **kwargs : dict, optional
        Not being used at the moment. Here only for compatibility.
    """
    # checking arguments
    if not isinstance(save_into, str | type(None)):
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}")
    if isinstance(save_into, str) and save_into not in ["all", "performance_db"]:
        raise ValueError(f"save_into must be 'all', 'performance_db' or None, not {save_into}")

    # checking if calculation was done
    if self.result is None:
        raise ValueError(
            "The calculation was not done. Cannot save the feature calculation results. Please make sure to do something like 'self._result = df[self.feature].copy()' in the method 'calculate' before calling 'self.save()'.",
        )

    if save_into is None:
        return

    # save_into was validated above and is not None at this point
    upload_to_bazefield = save_into == "all"

    # converting result series to DataFrame if needed
    if isinstance(self.result, Series):
        result_df = self.result.to_frame()
    elif isinstance(self.result, DataFrame):
        result_df = self.result.droplevel(0, axis=1)
    else:
        raise TypeError(f"result must be a pandas Series or DataFrame, not {type(self.result)}.")

    # adjusting DataFrame to be inserted in the database
    # making the columns a MultiIndex with levels object_name and feature_name
    result_df.columns = MultiIndex.from_product([[self.object], result_df.columns], names=["object_name", "feature_name"])

    self._perfdb.features.values.series.insert(
        df=result_df,
        on_conflict="update",
        bazefield_upload=upload_to_bazefield,
    )
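
The handling of save_into in this method reduces to a small decision: reject invalid values, do nothing for None, and otherwise decide whether Bazefield is included. A compact sketch of that logic follows. `resolve_bazefield_upload` is a hypothetical helper, not part of the library.

```python
from typing import Optional


def resolve_bazefield_upload(save_into: Optional[str]) -> Optional[bool]:
    """Return None when nothing should be saved, else whether to also upload to Bazefield."""
    if not isinstance(save_into, (str, type(None))):
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}")
    if save_into is None:
        return None
    if save_into not in ("all", "performance_db"):
        raise ValueError(f"save_into must be 'all', 'performance_db' or None, not {save_into}")
    # "all" saves to performance_db and Bazefield; "performance_db" saves to the database only
    return save_into == "all"


for option in (None, "performance_db", "all"):
    print(option, "->", resolve_bazefield_upload(option))
```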