Solar Theoretical Power

Overview

The FeatureCalcPowerTheoreticalSolar class is a subclass of FeatureCalculator that calculates the theoretical solar power production using a predictive model. Currently, this class expects the model to have been trained and saved using the abstract class PredictiveModel, and to be stored in the database as a pickle file.

This class uses a pre-trained random forest model that represents normal inverter operation to calculate the theoretical power. The model is implemented with sklearn's Random Forest and trained using Reactive Power from the inverter and the following features from the weather stations: Irradiance, Module Temperature, Ambient Temperature, Humidity and Hour of the day. A separate model is trained for each SPE, and the training script can be found on the performance server at manual_routines\postgres_fit_solar_power_predict.

Calculation Logic

The calculation works as follows:

  1. Load the Random Forest model from the database.
  2. Find the features needed by the model. For this calculation, features are gathered from the inverter and from the simple and complete weather stations.
  3. Get the values of these features.
  4. Predict the value of the feature using the model and the values of the features.
  5. Set predicted values to 0.0 kW during the night period. Night periods are identified with pvlib, based on the latitude and longitude of the inverter.
  6. Clip the final power to the inverter nominal power (defined in the object attributes) whenever it exceeds that value.
  7. Return the final values.
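
The night zeroing and clipping adjustments can be illustrated in isolation. The following is a minimal sketch in plain Python, not the class's implementation: the helper name `finalize_power`, the input values and the 330 kW nominal power are assumptions made for illustration, and the sun elevation is assumed to have been computed beforehand (the class obtains it from pvlib and works on a pandas Series).

```python
def finalize_power(predicted_kw, sun_elevation_deg, nominal_power_kw):
    """Zero night-time predictions and cap the rest at nominal power.

    Hypothetical helper, written only to illustrate the two adjustments.
    """
    if len(predicted_kw) != len(sun_elevation_deg):
        raise ValueError("predicted_kw and sun_elevation_deg must have the same length")

    adjusted = []
    for power, elevation in zip(predicted_kw, sun_elevation_deg):
        if elevation < 0:
            # sun below the horizon: treat the timestamp as night
            power = 0.0
        # never report more than the inverter nominal power
        adjusted.append(min(power, nominal_power_kw))
    return adjusted


predicted = [12.5, 150.0, 345.2, 8.1]  # made-up model output in kW
elevation = [-5.0, 20.0, 60.0, -1.0]   # made-up sun elevation in degrees
print(finalize_power(predicted, elevation, nominal_power_kw=330.0))
# prints [0.0, 150.0, 330.0, 0.0]
```

Zeroing before clipping and clipping before zeroing give the same result here, since 0.0 never exceeds a positive nominal power.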

Database Requirements

  • Feature attribute server_calc_type must be set to 'theoretical_active_power_solar'.
  • Feature attribute feature_options_json with the following keys:

    • calc_model_type: Type of the calculation model that will be used to calculate the feature. In this case: 'solar_power_curve'.
    • model_name: The name of the model to be considered in the feature calculation. Example: 'solar_power_curve!ActivePowerSolar'
    • bazefield_features: A boolean indicating whether the features come from bazefield. For the solar prediction to work today, all values must come from the bazefield database.

    Keep in mind that 'calc_model_type' and 'model_name' are only used to find the desired calculation model in the database. See views v_calculation_models and v_calculation_models_files_def for more details.

  • The features defined in the model must be present in the bazefield database with the same names used when the model was trained. If the feature names change, the pickle file must be manually downloaded from the database, updated with the new feature names and uploaded again.

  • The following object attributes for the object that is being calculated:
    • Required:
      • reference_weather_stations: A dict indicating which simple and complete weather stations are used during data acquisition. Example: {"simple_ws": "RBG-RBG2-MET1", "complete_ws": "RBG-RBG2-MET2"}
      • latitude: Geographical latitude of the inverter. Used to determine night periods.
      • longitude: Geographical longitude of the inverter. Used to determine night periods.
      • nominal_power: Nominal power from the inverter, used to clip result values.
  • The following feature for the object that is being calculated (from Bazefield database):
    • ReactivePower_5min.AVG: Reactive power from the inverter in kW
  • The following features for the simple weather station referenced by the object (from Bazefield database):
    • IrradiancePOACommOk_5min.AVG: Solar irradiance in W/m2
    • ModuleTempCommOk_5min.AVG: Module temperature in °C
  • The following features for the complete weather station referenced by the object (from Bazefield database):
    • AmbTemp_5min.AVG: Ambient temperature in °C
    • Humidity_5min.AVG: Humidity in %
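
To make the requirements above concrete, here is a hedged sketch of a matching configuration and of the per-object feature mapping it leads to. The inverter name `RBG-RBG2-INV1`, the coordinates, the nominal power value and the helper `build_required_features` are assumptions made for illustration. In the class itself, the feature lists come from the arguments stored with the trained model rather than being hard-coded.

```python
import json

# Feature attribute `feature_options_json`, using the example values listed above.
feature_options = json.loads(
    """
    {
        "calc_model_type": "solar_power_curve",
        "model_name": "solar_power_curve!ActivePowerSolar",
        "bazefield_features": true
    }
    """
)

# Object attributes of the inverter. Coordinates and nominal power are made-up values.
object_attributes = {
    "reference_weather_stations": {"simple_ws": "RBG-RBG2-MET1", "complete_ws": "RBG-RBG2-MET2"},
    "latitude": -12.5,
    "longitude": -42.1,
    "nominal_power": 330.0,
}


def build_required_features(inverter, options, attributes):
    """Map each object to the features that must be read for it (hypothetical helper)."""
    stations = attributes["reference_weather_stations"]
    features = {
        inverter: ["ReactivePower_5min.AVG"],
        stations["simple_ws"]: ["IrradiancePOACommOk_5min.AVG", "ModuleTempCommOk_5min.AVG"],
        stations["complete_ws"]: ["AmbTemp_5min.AVG", "Humidity_5min.AVG"],
    }
    # features read from bazefield carry the `_b#` suffix
    if options.get("bazefield_features", False):
        features = {obj: [f"{feat}_b#" for feat in feats] for obj, feats in features.items()}
    return features


required = build_required_features("RBG-RBG2-INV1", feature_options, object_attributes)
for obj, feats in required.items():
    print(obj, feats)
```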

Class Definition

FeatureCalcPowerTheoreticalSolar(object_name, feature)

Class used to calculate the theoretical active power for solar inverters.

For this class to work, the feature must have the attribute feature_options_json with the following keys:

  • calc_model_type: Type of the calculation model that will be used to calculate the feature.
  • model_name: Name of the feature that the model was trained to predict.
  • bazefield_features: bool indicating if the required features need to be acquired from bazefield.

Keep in mind that calc_model_type and model_name will be used to filter the calculation models in the database looking for just ONE that matches both.

The class will handle getting all the necessary features for the model to work based on what was defined when the model was trained.
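
The "just ONE that matches both" rule can be sketched as follows. The constructor wraps `model_name` as `.*<model_name>.*` and `calc_model_type` as `^<calc_model_type>$`. How the database layer applies these patterns is not shown in this page, so the use of `re.search`, the helper `find_single_model` and the model catalogue below are assumptions for illustration only.

```python
import re


def find_single_model(models, calc_model_type, model_name):
    """Return the only model whose name and type match, or raise LookupError."""
    # same pattern shapes the constructor builds from feature_options_json
    name_pattern = re.compile(f".*{model_name}.*")
    type_pattern = re.compile(f"^{calc_model_type}$")

    matches = [
        model
        for model in models
        if name_pattern.search(model["name"]) and type_pattern.search(model["type"])
    ]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one matching model, found {len(matches)}")
    return matches[0]


# made-up catalogue of calculation models
models = [
    {"name": "RBG-RBG2-INV1_solar_power_curve!ActivePowerSolar", "type": "solar_power_curve"},
    {"name": "RBG-RBG2-INV1_other_model", "type": "solar_power_curve"},
    {"name": "solar_power_curve!ActivePowerSolar", "type": "solar_power_curve_v2"},
]

selected = find_single_model(models, "solar_power_curve", "solar_power_curve!ActivePowerSolar")
print(selected["name"])
# prints RBG-RBG2-INV1_solar_power_curve!ActivePowerSolar
```

Because the name pattern is a substring match while the type pattern is anchored, a model name only needs to contain `model_name`, whereas the type must match exactly.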

Parameters:

  • object_name

    (str) –

    Name of the object for which the feature is calculated. It must exist in performance_db.

  • feature

    (str) –

    Feature of the object that is calculated. It must exist in performance_db.

Source code in echo_energycalc/feature_calc_power_theoretical_solar.py
def __init__(
    self,
    object_name: str,
    feature: str,
) -> None:
    """
    Class used to calculate features that depend on a PredictiveModel.

    For this class to work, the feature must have the attribute `feature_options_json` with the following keys:

    - `calc_model_type`: Type of the calculation model that will be used to calculate the feature.
    - `model_name`: Name of the feature that the model was trained to predict.
    - `bazefield_features`: bool indicating if the required features need to be acquired from bazefield.

    Keep in mind that `calc_model_type` and `model_name` will be used to filter the calculation models in the database looking for just ONE that matches both.

    The class will handle getting all the necessary features for the model to work based on what was defined when the model was trained.

    Parameters
    ----------
    object_name : str
        Name of the object for which the feature is calculated. It must exist in performance_db.
    feature : str
        Feature of the object that is calculated. It must exist in performance_db.
    """
    # initialize parent class
    super().__init__(object_name, feature)

    self._add_requirement(RequiredFeatureAttributes(self.object, self.feature, ["feature_options_json"]))

    self._get_required_data()

    self._feature_attributes = self._get_requirement_data("RequiredFeatureAttributes")[self.feature]

    self._validate_feature_options()

    self._add_requirement(
        RequiredCalcModels(
            calc_models={
                self.object: [
                    {
                        "model_name": f".*{self._feature_attributes['feature_options_json']['model_name']}.*",
                        "model_type": f"^{self._feature_attributes['feature_options_json']['calc_model_type']}$",
                    },
                ],
            },
        ),
    )
    self._add_requirement(
        RequiredObjectAttributes(
            {
                self.object: [
                    "reference_weather_stations",
                    "latitude",
                    "longitude",
                    "nominal_power",
                ],
            },
        ),
    )
    self._get_required_data()

    # getting the model name
    self._model_name = next(iter(self._get_requirement_data("RequiredCalcModels")[self.object].keys()))

    # loading calculation model from file
    try:
        self._model: SolarPowerRFPredictiveModel = self._get_requirement_data("RequiredCalcModels")[self.object][self._model_name][
            "model"
        ]
        if not isinstance(self._model, SolarPowerRFPredictiveModel):
            raise TypeError(f"'{self.object}' is not an instance of a subclass of SolarPowerRFPredictiveModel.")
        self._model._deserialize_model()  # noqa: SLF001

    except Exception as e:
        raise RuntimeError(f"'{self.object}' failed to load SolarPowerRFPredictiveModel.") from e

    # checking if model object is an instance of a subclass of SolarPowerRFPredictiveModel
    if not isinstance(self._model, SolarPowerRFPredictiveModel):
        raise TypeError(f"'{self.object}' is not an instance of a subclass of SolarPowerRFPredictiveModel.")

    # defining required features
    reference_features = [
        feat
        for feat in self._model.model_arguments.reference_features
        if feat not in getattr(self._model.model_arguments, "ignore_baze_object_features", [])
    ]
    simple_ws = self._get_requirement_data("RequiredObjectAttributes")[self.object]["reference_weather_stations"]["simple_ws"]
    complete_ws = self._get_requirement_data("RequiredObjectAttributes")[self.object]["reference_weather_stations"]["complete_ws"]

    features = {
        self.object: reference_features,
        simple_ws: self._model.model_arguments.simple_ws_features,
        complete_ws: self._model.model_arguments.complete_ws_features,
    }

    # adds the suffix _b# if bazefield_features is True
    if self._feature_attributes["feature_options_json"].get("bazefield_features", False):
        features = {obj: [f"{feat}_b#" for feat in feats] for obj, feats in features.items()}
    self._add_requirement(RequiredFeatures(features=features))

    # checking if model has more than one target feature
    if len(self._model.model_arguments.target_features) > 1:
        raise NotImplementedError("SolarPowerRFPredictiveModel with more than one target feature is not supported yet.")

feature property

Feature that is calculated. This will be defined in the constructor and cannot be changed.

Returns:

  • str

    Name of the feature that is calculated.

name property

Name of the feature calculator. Is defined in child classes of FeatureCalculator.

This must be equal to the "server_calc_type" attribute of the feature in performance_db.

Returns:

  • str

    Name of the feature calculator.

object property

Object for which the feature is calculated. This will be defined in the constructor and cannot be changed.

Returns:

  • str

    Object name for which the feature is calculated.

requirements property

List of requirements of the feature calculator. Is defined in child classes of FeatureCalculator.

Returns:

  • dict[str, list[CalculationRequirement]]

    Dict of requirements.

    The keys are the names of the classes of the requirements and the values are lists of requirements of that class.

    For example: {"RequiredFeatures": [RequiredFeatures(...), RequiredFeatures(...)], "RequiredObjects": [RequiredObjects(...)]}
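
The shape of this dict can be illustrated with a short sketch. The classes below are empty stand-ins that only borrow the names used in this page, and `group_by_class_name` is a hypothetical helper. How the library actually builds the dict is not shown here.

```python
from collections import defaultdict


class CalculationRequirement:
    """Stand-in for the library's base requirement class."""


class RequiredFeatures(CalculationRequirement):
    """Stand-in for a feature requirement."""


class RequiredObjectAttributes(CalculationRequirement):
    """Stand-in for an object attribute requirement."""


def group_by_class_name(requirements):
    """Group requirement instances under the name of their class."""
    grouped = defaultdict(list)
    for requirement in requirements:
        grouped[type(requirement).__name__].append(requirement)
    return dict(grouped)


requirements = [RequiredFeatures(), RequiredObjectAttributes(), RequiredFeatures()]
grouped = group_by_class_name(requirements)
print({name: len(items) for name, items in grouped.items()})
# prints {'RequiredFeatures': 2, 'RequiredObjectAttributes': 1}
```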

result property

Result of the calculation. This is None until the method "calculate" is called.

Returns:

  • Series | DataFrame | None:

    Result of the calculation if the method "calculate" was called. None otherwise.

calculate(period, save_into=None, cached_data=None, **kwargs)

Method that will calculate the feature.

Parameters:

  • period

    (DateTimeRange) –

    Period for which the feature will be calculated.

  • save_into

    (Literal['all', 'performance_db'] | None, default: None ) –

    Argument that will be passed to the method "save". The options are:

      • "all": The feature will be saved in performance_db and bazefield.
      • "performance_db": The feature will be saved only in performance_db.
      • None: The feature will not be saved.

    By default None.

  • cached_data

    (DataFrame | None, default: None ) –

    DataFrame with features already queried/calculated. This is useful to avoid needing to query all the data again from performance_db, making chained calculations a lot more efficient. By default None

  • **kwargs

    (dict, default: {} ) –

    Additional arguments that will be passed to the "save" method.

Returns:

  • Series

    Pandas Series with the calculated feature.

Source code in echo_energycalc/feature_calc_power_theoretical_solar.py
def calculate(
    self,
    period: DateTimeRange,
    save_into: Literal["all", "performance_db"] | None = None,
    cached_data: DataFrame | None = None,
    **kwargs,
) -> Series:
    """
    Method that will calculate the feature.

    Parameters
    ----------
    period : DateTimeRange
        Period for which the feature will be calculated.
    save_into : Literal["all", "performance_db"] | None, optional
        Argument that will be passed to the method "save". The options are:
        - "all": The feature will be saved in performance_db and bazefield.
        - "performance_db": the feature will be saved only in performance_db.
        - None: The feature will not be saved.

        By default None.
    cached_data : DataFrame | None, optional
        DataFrame with features already queried/calculated. This is useful to avoid needing to query all the data again from performance_db, making chained calculations a lot more efficient.
        By default None
    **kwargs : dict, optional
        Additional arguments that will be passed to the "save" method.

    Returns
    -------
    Series
        Pandas Series with the calculated feature.
    """
    t0 = perf_counter()

    # adjusting period to account for lagged timestamps
    adjusted_period = period.copy()

    # creating a series to store the result
    result = self._create_empty_result(period=adjusted_period, freq="5min", result_type="Series")

    # getting feature values
    self._get_required_data(
        period=adjusted_period,
        reindex=None,
        round_timestamps={"freq": timedelta(minutes=5), "tolerance": timedelta(minutes=2)},
        cached_data=cached_data,
    )

    # getting DataFrame with feature values
    df = self._get_requirement_data("RequiredFeatures")

    # Creates a dummy 'object' index level named 'Inverter'
    df["object"] = "Inverter"
    df = df.set_index("object", append=True)
    df.index = df.index.set_names(["timestamp", "object"])
    df.index = df.index.swaplevel("object", "timestamp")
    df = df.sort_index()
    df.columns = df.columns.get_level_values("feature")
    # adjusting dtype of index
    df.index = df.index.set_levels(df.index.levels[0].astype("string[pyarrow]"), level=0)
    df.index = df.index.set_levels(df.index.levels[1].astype("datetime64[s]"), level=1)

    # Remove the suffix _b# from the columns
    df.columns = df.columns.str.replace("_b#$", "", regex=True)

    t1 = perf_counter()

    # Sets Irradiance to NaN when all other columns are null; this is needed because of how bazefield integrates irradiance values
    cols_to_check_for_nan = [col for col in df.columns if col not in ["object", "timestamp", "IrradiancePOACommOk_5min.AVG"]]
    # Check when Irradiance is 0
    cond_irradiance_is_zero = df["IrradiancePOACommOk_5min.AVG"] == 0
    # Check if all other columns are NaN
    cond_others_are_nan = df[cols_to_check_for_nan].isna().all(axis=1)
    # Assign NaN whenever irradiance is zero and all other columns are NaN
    mask = cond_irradiance_is_zero & cond_others_are_nan
    df.loc[mask, "IrradiancePOACommOk_5min.AVG"] = np.nan

    # Fills NaNs in the listed columns with the last valid value
    cols_ffill = [
        "ModuleTempCommOk_5min.AVG",
        "AmbTemp_5min.AVG",
        "Humidity_5min.AVG",
        "IrradiancePOACommOk_5min.AVG",
    ]
    df[cols_ffill] = df[cols_ffill].ffill()

    # Fills NaNs in the 'ReactivePower_5min.AVG' column with random values between -1 and 1 (2 decimal places)
    # this is only done if ReactivePower_5min.AVG is among the columns of df
    if "ReactivePower_5min.AVG" in df.columns:
        mask = df["ReactivePower_5min.AVG"].isna()
        n_missing = mask.sum()
        np_gen = np.random.default_rng()
        if n_missing > 0:
            df.loc[mask, "ReactivePower_5min.AVG"] = np.round(
                np_gen.uniform(-1, 1, size=n_missing),
                2,
            )
    # converting the data to numpy float32 for compatibility with tensorflow
    df = df.astype("float32")

    t2 = perf_counter()

    # only predict if there is data
    if not df.empty:
        # predicting values
        model_output = self._model.predict(df)
        # dropping one level from the index
        model_output = model_output.droplevel("object")

        # adding output to results
        wanted_idx = result.index.intersection(model_output.index)
        result.loc[wanted_idx] = model_output.loc[wanted_idx, self._model.model_arguments.target_features[0]].values

    t3 = perf_counter()

    # trimming result to the original period
    result = result[(result.index >= period.start) & (result.index <= period.end)].copy()

    timestamps = result.index

    # adding 3 hours to convert to UTC (assumes the timestamps are in UTC-3 local time)
    times_pd = timestamps + Timedelta(hours=3)

    # Calculate solar positions (zenith angles) - vectorized operation
    solar_position = pvlib.solarposition.get_solarposition(
        time=times_pd,
        latitude=self._get_requirement_data("RequiredObjectAttributes")[self.object]["latitude"],
        longitude=self._get_requirement_data("RequiredObjectAttributes")[self.object]["longitude"],
    )

    # Get the sun's elevation (altitude)
    # Sun altitude < 0 means the sun is below the horizon (night)
    is_night = solar_position["elevation"] < 0
    # Zeroes values during the night period using is_night
    result.loc[is_night.values] = 0.0

    # Clip values to the inverter nominal power
    result = result.clip(upper=self._get_requirement_data("RequiredObjectAttributes")[self.object]["nominal_power"])
    # adding calculated feature to class result attribute
    self._result = result.copy()

    # saving results
    self.save(save_into=save_into, **kwargs)

    logger.debug(
        f"{self.object} - {self.feature} - {period}: Requirements during calc {t1 - t0:.2f}s - Data adjustments {t2 - t1:.2f}s - Model prediction {t3 - t2:.2f}s - Final adjustments {perf_counter() - t3:.2f}s",
    )

    return result
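
The data adjustments made in calculate before prediction can be reproduced on a small table. This is a simplified sketch under stated assumptions: it uses only three of the feature columns, made-up values and a single-level timestamp index, whereas the method works on a two-level (object, timestamp) index and also fills missing reactive power with small random values.

```python
import numpy as np
import pandas as pd

IRRADIANCE = "IrradiancePOACommOk_5min.AVG"

# made-up 5-minute samples; NaN marks a gap in the source data
df = pd.DataFrame(
    {
        IRRADIANCE: [500.0, 0.0, np.nan, 620.0],
        "ModuleTempCommOk_5min.AVG": [40.0, np.nan, np.nan, 43.0],
        "AmbTemp_5min.AVG": [25.0, np.nan, 26.0, 27.0],
    },
    index=pd.date_range("2024-01-01 10:00", periods=4, freq="5min"),
)

# a zero irradiance with every other column missing is treated as a gap, not a real zero
other_cols = [col for col in df.columns if col != IRRADIANCE]
mask = (df[IRRADIANCE] == 0) & df[other_cols].isna().all(axis=1)
df.loc[mask, IRRADIANCE] = np.nan

# carry the last valid value forward over the gaps
df = df.ffill()
print(df)
```

In this sample the second row starts with an irradiance of 0 and no other data, so it is reset to NaN and then filled with the previous value (500.0) instead of being fed to the model as a genuine zero.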

save(save_into=None, **kwargs)

Method to save the calculated feature values in performance_db.

Parameters:

  • save_into

    (Literal['all', 'performance_db'] | None, default: None ) –

    Defines where the feature will be saved. The options are:

      • "all": The feature will be saved in performance_db and bazefield.
      • "performance_db": The feature will be saved only in performance_db.
      • None: The feature will not be saved.

    By default None.

  • **kwargs

    (dict, default: {} ) –

    Not being used at the moment. Here only for compatibility.

Source code in echo_energycalc/feature_calc_core.py
def save(
    self,
    save_into: Literal["all", "performance_db"] | None = None,
    **kwargs,  # noqa: ARG002
) -> None:
    """
    Method to save the calculated feature values in performance_db.

    Parameters
    ----------
    save_into : Literal["all", "performance_db"] | None, optional
        Argument that will be passed to the method "save". The options are:
        - "all": The feature will be saved in performance_db and bazefield.
        - "performance_db": the feature will be saved only in performance_db.
        - None: The feature will not be saved.

        By default None.
    **kwargs : dict, optional
        Not being used at the moment. Here only for compatibility.
    """
    # checking arguments
    if not isinstance(save_into, str | type(None)):
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}")
    if isinstance(save_into, str) and save_into not in ["all", "performance_db"]:
        raise ValueError(f"save_into must be 'all', 'performance_db' or None, not {save_into}")

    # checking if calculation was done
    if self.result is None:
        raise ValueError(
            "The calculation was not done. Cannot save the feature calculation results. Please make sure to do something like 'self._result = df[self.feature].copy()' in the method 'calculate' before calling 'self.save()'.",
        )

    if save_into is None:
        return

    if isinstance(save_into, str):
        if save_into not in ["performance_db", "all"]:
            raise ValueError(f"save_into must be 'performance_db' or 'all', not {save_into}.")
        upload_to_bazefield = save_into == "all"
    elif save_into is None:
        upload_to_bazefield = False
    else:
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}.")

    # converting result series to DataFrame if needed
    if isinstance(self.result, Series):
        result_df = self.result.to_frame()
    elif isinstance(self.result, DataFrame):
        result_df = self.result.droplevel(0, axis=1)
    else:
        raise TypeError(f"result must be a pandas Series or DataFrame, not {type(self.result)}.")

    # adjusting DataFrame to be inserted in the database
    # making the columns a Multindex with levels object_name and feature_name
    result_df.columns = MultiIndex.from_product([[self.object], result_df.columns], names=["object_name", "feature_name"])

    self._perfdb.features.values.series.insert(
        df=result_df,
        on_conflict="update",
        bazefield_upload=upload_to_bazefield,
    )
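
The handling of save_into in the listing above boils down to a small decision table. The sketch below restates it as a standalone function. The name `resolve_upload_to_bazefield` and the convention of returning None when nothing is saved are assumptions for illustration, not part of the library.

```python
def resolve_upload_to_bazefield(save_into):
    """Translate save_into into the bazefield upload flag.

    Returns None when nothing should be saved, True when the values also go
    to bazefield, and False when they go only to performance_db.
    """
    if save_into is None:
        return None
    if not isinstance(save_into, str):
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}")
    if save_into not in ("all", "performance_db"):
        raise ValueError(f"save_into must be 'all', 'performance_db' or None, not {save_into}")
    return save_into == "all"


for option in ("all", "performance_db", None):
    print(repr(option), "->", resolve_upload_to_bazefield(option))
```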