
Wind Farm Wind Speed

Overview

The FeatureCalcWFReferenceWS class is a subclass of FeatureCalculator that calculates a reference value for the wind speed of a wind farm. This is considered the most representative and reliable wind speed value for the wind farm, and it is used as a reference for other calculations, including the wind speed KPI.

Calculation Logic

The calculation logic is described in the constructor of the class, shown below in the Class Definition.

Database Requirements

  • Feature attribute server_calc_type must be set to wind_farm_reference_wind_speed.
  • The following object attributes for the SPE object:
    • reference_met_masts: List of met masts that are considered reference met masts for the SPE object, in order of proximity.
    • met_mast_wind_speed_regressions: Dictionary that maps each neighbor to the regression parameters (slope and offset) for its wind speed. It is used to correct data from neighbors when the data from this SPE is absent. An example of the dictionary:

      {
          "neighbor1": {"slope": 1, "offset": 0},
          "neighbor2": {"slope": 1, "offset": 0},
          ...
      }
      

      This can be calculated using the script on the performance server in the folder manual_routines\postgres_calc_regressions.
  • The following features for the reference met masts:
    • wind_speed_1_avg: Wind speed in m/s.
  • The following features for the SPE object:
    • wind_speed_turbine_avg: Wind speed in m/s from all turbines in the SPE. Only used when the data from the met masts is absent.
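
The regression correction described above can be illustrated with a minimal sketch. It uses plain Python lists instead of the pandas objects the class works with, and the helper names pick_regression and apply_regression are hypothetical, introduced only for this example. The fallback to slope 1 and offset 0 when a neighbor has no complete entry mirrors the default used in calculate.

```python
# Identity regression used when a neighbor has no usable entry (assumption
# taken from the default {"slope": 1, "offset": 0} used in calculate).
DEFAULT_REGRESSION = {"slope": 1.0, "offset": 0.0}


def pick_regression(regressions, neighbor):
    """Return the regression for `neighbor`, or the identity one if missing or incomplete."""
    candidate = regressions.get(neighbor, {})
    if "slope" in candidate and "offset" in candidate:
        return candidate
    return DEFAULT_REGRESSION


def apply_regression(values, regression):
    """Compute value * slope + offset, keeping missing samples (None) missing."""
    return [
        None if value is None else value * regression["slope"] + regression["offset"]
        for value in values
    ]


# Illustrative numbers only, not real measurements.
regressions = {"neighbor1": {"slope": 1.5, "offset": 0.5}}
neighbor_ws = [4.0, None, 6.0]

corrected = apply_regression(neighbor_ws, pick_regression(regressions, "neighbor1"))
unchanged = apply_regression(neighbor_ws, pick_regression(regressions, "neighbor2"))
```

Here corrected holds the values scaled by the neighbor1 regression, while unchanged is identical to the input because neighbor2 has no entry in the dictionary.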

Class Definition

FeatureCalcWFReferenceWS(object_name, feature)

Class used to calculate reference wind speed for a wind farm.

This reference wind speed is calculated by going through each of the met masts defined in the reference_met_masts attribute of the wind farm and copying its values, corrected by the regression defined in the met_mast_wind_speed_regressions attribute of the SPE if present.

The second, third, etc. met masts are only used if the first one is not available for the entire period.

At the end, if there are still missing timestamps, they are filled with the average wind speed of all wind turbines in this wind farm.

The final goal is to have the most representative value for the wind speed at the wind farm avoiding any NaN values.
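
The fill order described above can be sketched as follows. This is a simplified illustration with plain Python lists where None marks a missing timestamp, not the class's actual pandas-based code. The function names are hypothetical, the met mast series are assumed to be regression-corrected already, and the narrowing of the queried period to the span of remaining gaps is left out.

```python
def count_missing(values):
    """Number of missing (None) samples."""
    return sum(value is None for value in values)


def fill_missing(result, candidate):
    """Fill gaps in `result` with the value at the same position in `candidate`."""
    return [c if r is None else r for r, c in zip(result, candidate)]


def reference_wind_speed(mast_series, turbine_average, max_nan=5):
    """
    mast_series: per-mast series in priority order (closest mast first).
    turbine_average: fallback series with the average of all turbines.
    max_nan: number of gaps below which no further met mast is consulted.
    """
    result = [None] * len(turbine_average)
    for series in mast_series:
        # Stop consulting met masts once few enough gaps remain.
        if count_missing(result) < max_nan:
            break
        result = fill_missing(result, series)
    # Any gap still left is filled from the turbine average, if available.
    if count_missing(result) > 0:
        result = fill_missing(result, turbine_average)
    return result


# Illustrative numbers only.
mast1 = [5.0, None, None, 5.3, None, 5.5]
mast2 = [4.9, 5.1, None, 5.2, None, 5.4]
turbines = [4.0, 4.1, 4.2, 4.3, None, 4.5]

filled = reference_wind_speed([mast1, mast2], turbines, max_nan=2)
```

In this sketch the second met mast only contributes where the first one has gaps, and the turbine average only contributes where both met masts are missing. A timestamp absent from every source stays missing, which corresponds to the case that calculate reports with a warning.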

Parameters:

  • object_name

    (str) –

    Name of the object for which the feature is calculated. It must exist in performance_db.

  • feature

    (str) –

    Feature of the object that is calculated. It must exist in performance_db.

Source code in echo_energycalc/feature_calc_ws_reference.py
def __init__(
    self,
    object_name: str,
    feature: str,
) -> None:
    """
    Class used to calculate reference wind speed for a wind farm.

    This reference wind speed is calculated by going through each of the met masts defined in the `reference_met_masts` attribute of the wind farm and copying its values, corrected by the regression defined in the `met_mast_wind_speed_regressions` attribute of the SPE if present.

    The second, third, etc. met masts are only used if the first one is not available for the entire period.

    At the end, if there are still missing timestamps, they are filled with the average wind speed of all wind turbines in this wind farm.

    The final goal is to have the most representative value for the wind speed at the wind farm avoiding any NaN values.

    Parameters
    ----------
    object_name : str
        Name of the object for which the feature is calculated. It must exist in performance_db.
    feature : str
        Feature of the object that is calculated. It must exist in performance_db.
    """
    # initialize parent class
    super().__init__(object_name, feature)

    # required reference met masts
    self._add_requirement(RequiredObjectAttributes({self.object: ["reference_met_masts"]}))
    self._get_required_data()

    # required met_mast_wind_speed_regressions (first met mast only)
    self._add_requirement(
        RequiredObjectAttributes(
            {
                self._get_requirement_data("RequiredObjectAttributes")[self.object]["reference_met_masts"][0]: [
                    "met_mast_wind_speed_regressions",
                ],
            },
            optional=True,
        ),
    )
    self._get_required_data()

    # number of NaN timestamps that is tolerated, to avoid long calculations trying to fill every NaN
    self._max_nan = 5

feature property

Feature that is calculated. This will be defined in the constructor and cannot be changed.

Returns:

  • str

    Name of the feature that is calculated.

name property

Name of the feature calculator. It is defined in child classes of FeatureCalculator.

This must be equal to the "server_calc_type" attribute of the feature in performance_db.

Returns:

  • str

    Name of the feature calculator.

object property

Object for which the feature is calculated. This will be defined in the constructor and cannot be changed.

Returns:

  • str

    Object name for which the feature is calculated.

requirements property

Requirements of the feature calculator. They are defined in child classes of FeatureCalculator.

Returns:

  • dict[str, list[CalculationRequirement]]

    Dict of requirements.

    The keys are the names of the classes of the requirements and the values are lists of requirements of that class.

    For example: {"RequiredFeatures": [RequiredFeatures(...), RequiredFeatures(...)], "RequiredObjects": [RequiredObjects(...)]}

result property

Result of the calculation. This is None until the method "calculate" is called.

Returns:

  • Series | DataFrame | None:

    Result of the calculation if the method "calculate" was called. None otherwise.

calculate(period, save_into=None, cached_data=None, **kwargs)

Method that will calculate the feature.

Parameters:

  • period

    (DateTimeRange) –

    Period for which the feature will be calculated.

  • save_into

    (Literal['all', 'performance_db'] | None, default: None ) –

    Argument that will be passed to the method "save". The options are:

      • "all": The feature will be saved in performance_db and bazefield.
      • "performance_db": The feature will be saved only in performance_db.
      • None: The feature will not be saved.

    By default None.

  • cached_data

    (DataFrame | None, default: None ) –

    DataFrame with features already queried/calculated. This is useful to avoid needing to query all the data again from performance_db, making chained calculations a lot more efficient. By default None

  • **kwargs

    (dict, default: {} ) –

    Additional arguments that will be passed to the "save" method.

Returns:

  • Series

    Pandas Series with the calculated feature.

Source code in echo_energycalc/feature_calc_ws_reference.py
def calculate(
    self,
    period: DateTimeRange,
    save_into: Literal["all", "performance_db"] | None = None,
    cached_data: DataFrame | None = None,
    **kwargs,
) -> Series:
    """
    Method that will calculate the feature.

    Parameters
    ----------
    period : DateTimeRange
        Period for which the feature will be calculated.
    save_into : Literal["all", "performance_db"] | None, optional
        Argument that will be passed to the method "save". The options are:
        - "all": The feature will be saved in performance_db and bazefield.
        - "performance_db": the feature will be saved only in performance_db.
        - None: The feature will not be saved.

        By default None.
    cached_data : DataFrame | None, optional
        DataFrame with features already queried/calculated. This is useful to avoid needing to query all the data again from performance_db, making chained calculations a lot more efficient.
        By default None
    **kwargs : dict, optional
        Additional arguments that will be passed to the "save" method.

    Returns
    -------
    Series
        Pandas Series with the calculated feature.
    """
    # creating Series to store results
    result = self._create_empty_result(period=period, result_type="Series")
    # converting to NA enabled Series
    result = result.astype("Float64")

    # getting the base met mast
    base_met_mast = self._get_requirement_data("RequiredObjectAttributes")[self.object]["reference_met_masts"][0]
    # getting met_mast_wind_speed_regressions
    regressions = {}
    if (
        base_met_mast in self._get_requirement_data("RequiredObjectAttributes")
        and "met_mast_wind_speed_regressions" in self._get_requirement_data("RequiredObjectAttributes")[base_met_mast]
    ):
        regressions = self._get_requirement_data("RequiredObjectAttributes")[base_met_mast]["met_mast_wind_speed_regressions"]

    # iterating over all reference met masts
    for ref_mast in self._get_requirement_data("RequiredObjectAttributes")[self.object]["reference_met_masts"]:
        # stopping if the number of remaining NaNs is below the acceptable threshold
        if result.isna().sum() < self._max_nan:
            break

        # defining only required period (start is the timestamp of the first NaN and end is the timestamp of the last NaN)
        this_period = DateTimeRange(start=result[result.isna()].index[0], end=result[result.isna()].index[-1])

        # adding data for this reference met mast as a requirement
        req_features = {ref_mast: ["WindSpeed1_10min.AVG"]}
        self._add_requirement(RequiredFeatures(req_features))

        # getting required data for this reference met mast
        self._get_required_data(period=this_period, reindex="10min", only_missing=True, cached_data=cached_data)

        # checking if met_mast_wind_speed_regressions is defined
        regression = {"slope": 1, "offset": 0}
        if ref_mast in regressions and "slope" in regressions[ref_mast] and "offset" in regressions[ref_mast]:
            regression = regressions[ref_mast]

        # getting data for this reference met mast and applying regression
        ref_series = self._get_requirement_data("RequiredFeatures").loc[:, IndexSlice[ref_mast, "WindSpeed1_10min.AVG"]].copy()
        ref_series = ref_series * regression["slope"] + regression["offset"]

        na_count = result.isna().sum()

        # converting to NA enabled Series to match result
        ref_series = ref_series.astype("Float64")

        # filling NaN values in result with values from this reference met mast
        result = result.combine_first(ref_series)

        logger.info(
            f"'{self.object}' - '{self.feature}': Calculated reference wind speed using met mast '{ref_mast}' - {na_count - result.isna().sum()} timestamps filled - {result.isna().sum()} remaining.",
        )

    # getting average wind speed from all wind turbines if there are still NaN values
    if result.isna().sum() > 0:
        # adding data for all wind turbines as a requirement
        req_features = {self.object: ["WindSpeed_10min.AVG"]}
        self._add_requirement(RequiredFeatures(req_features))

        self._get_required_data(period=period, reindex="10min", only_missing=True, cached_data=cached_data)
        turbines_series = self._get_requirement_data("RequiredFeatures").loc[:, IndexSlice[self.object, "WindSpeed_10min.AVG"]].copy()

        na_count = result.isna().sum()

        # filling NaN values in result with values from all wind turbines
        result = result.combine_first(turbines_series)

        logger.info(
            f"'{self.object}' - '{self.feature}': Calculated reference wind speed using average wind speed from all wind turbines - {na_count - result.isna().sum()} timestamps filled - {result.isna().sum()} remaining.",
        )

    # checking if result has NaNs
    nan_count = result.isna().sum()
    if nan_count > 0:
        logger.warning(f"'{self.object}' - '{self.feature}': Could not calculate {nan_count} - {nan_count / len(result):.2%} values.")

    # adding calculated feature to class result attribute
    self._result = result.copy()

    # saving results
    self.save(save_into=save_into, **kwargs)

    return result

save(save_into=None, **kwargs)

Method to save the calculated feature values in performance_db.

Parameters:

  • save_into

    (Literal['all', 'performance_db'] | None, default: None ) –

    Argument that will be passed to the method "save". The options are:

      • "all": The feature will be saved in performance_db and bazefield.
      • "performance_db": The feature will be saved only in performance_db.
      • None: The feature will not be saved.

    By default None.

  • **kwargs

    (dict, default: {} ) –

    Not being used at the moment. Here only for compatibility.
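
The way the save_into options above translate into a save decision can be sketched as below. The helper resolve_save_target is hypothetical and exists only for this illustration. It condenses the argument checks into one function and does not touch the database or verify that a result was calculated.

```python
def resolve_save_target(save_into):
    """
    Return None when nothing should be saved, otherwise a bool telling
    whether the values should also be uploaded to bazefield.
    """
    if not isinstance(save_into, (str, type(None))):
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}")
    if save_into is None:
        return None
    if save_into not in ("all", "performance_db"):
        raise ValueError(f"save_into must be 'all', 'performance_db' or None, not {save_into}")
    # "all" means performance_db plus bazefield.
    return save_into == "all"


upload_all = resolve_save_target("all")
upload_db_only = resolve_save_target("performance_db")
no_save = resolve_save_target(None)
```

Any other string is rejected with a ValueError, and a non-string, non-None value with a TypeError.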

Source code in echo_energycalc/feature_calc_core.py
def save(
    self,
    save_into: Literal["all", "performance_db"] | None = None,
    **kwargs,  # noqa: ARG002
) -> None:
    """
    Method to save the calculated feature values in performance_db.

    Parameters
    ----------
    save_into : Literal["all", "performance_db"] | None, optional
        Argument that will be passed to the method "save". The options are:
        - "all": The feature will be saved in performance_db and bazefield.
        - "performance_db": the feature will be saved only in performance_db.
        - None: The feature will not be saved.

        By default None.
    **kwargs : dict, optional
        Not being used at the moment. Here only for compatibility.
    """
    # checking arguments
    if not isinstance(save_into, str | type(None)):
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}")
    if isinstance(save_into, str) and save_into not in ["all", "performance_db"]:
        raise ValueError(f"save_into must be 'all', 'performance_db' or None, not {save_into}")

    # checking if calculation was done
    if self.result is None:
        raise ValueError(
            "The calculation was not done. Cannot save the feature calculation results. Please make sure to do something like 'self._result = df[self.feature].copy()' in the method 'calculate' before calling 'self.save()'.",
        )

    if save_into is None:
        return

    if isinstance(save_into, str):
        if save_into not in ["performance_db", "all"]:
            raise ValueError(f"save_into must be 'performance_db' or 'all', not {save_into}.")
        upload_to_bazefield = save_into == "all"
    elif save_into is None:
        upload_to_bazefield = False
    else:
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}.")

    # converting result series to DataFrame if needed
    if isinstance(self.result, Series):
        result_df = self.result.to_frame()
    elif isinstance(self.result, DataFrame):
        result_df = self.result.droplevel(0, axis=1)
    else:
        raise TypeError(f"result must be a pandas Series or DataFrame, not {type(self.result)}.")

    # adjusting DataFrame to be inserted in the database
    # making the columns a MultiIndex with levels object_name and feature_name
    result_df.columns = MultiIndex.from_product([[self.object], result_df.columns], names=["object_name", "feature_name"])

    self._perfdb.features.values.series.insert(
        df=result_df,
        on_conflict="update",
        bazefield_upload=upload_to_bazefield,
    )