Spectrum Amplitude

Overview

The FeatureCalcSpectrumAmplitude class is a subclass of FeatureCalculator that calculates the amplitude of a signal's spectrum at a given frequency. The resulting feature tracks how that amplitude evolves over time, making it easy to see whether a component associated with that frequency is vibrating more than usual, which could indicate a problem.

Calculation Logic

The calculation logic is described in the docstring of the class constructor, shown below in the Class Definition section.

Database Requirements

  • Feature attribute server_calc_type must be set to spectrum_amplitude.
  • Feature attribute feature_options_json with the following keys:
    • frequency_name: Name of the subcomponent attribute that contains the value of the desired frequency, not including the "FREQ-" prefix. This must be in orders and will be considered as the center frequency of the band.
    • sensor_name: Name of the sensor that will be used to get the spectrum. Names must be the ones available for perfdb.vibration.spectrum.get method.
    • acquisition_rate: Acquisition rate of the sensor. Only applicable for Gamesa turbines. For other manufacturers set this to none.
    • spectrum_type: Type of the spectrum that will be used. Options are: 'Normal' or 'Envelope'.
    • band_width: Width of the band used to calculate the amplitude. This allows values to be taken not only at the center frequency but also at the surrounding frequencies. The band is centered on the frequency defined in frequency_name and extends band_width/2 on each side.
    • operation: Operation used to aggregate all amplitude values in the band. Any aggregation method available on pandas.Series can be used, but in most cases max or mean is appropriate.
  • Data in raw_data_values table that correspond to the desired object, sensor, acquisition rate, and time range.

The following queries are an example of how to create a feature that uses the FeatureCalcSpectrumAmplitude class:

-- Create the feature
SELECT * FROM performance.fn_create_or_update_feature(
    'G97-2.07', -- name of the object model
    'server_calc', -- name of the data source type (server_calc)
    'test_spectrum_amplitude', -- name of the feature
    'Test feature for FeatureCalcSpectrumAmplitude', -- description of the feature
    NULL, -- name of the feature in the data source (not applicable for calculations)
    NULL, -- id of the feature in the data source (not applicable for calculations)
    NULL, -- unit of the feature
    NULL -- leave as NULL, deprecated
);

-- Set the feature attribute `server_calc_type`
SELECT * FROM performance.fn_set_feature_attribute(
    'test_spectrum_amplitude', -- name of the feature
    'G97-2.07', -- name of the object model
    'server_calc_type', -- name of the attribute
    '{"attribute_value": "spectrum_amplitude"}' -- value of the attribute
);

-- Set the feature attribute `feature_options_json`
SELECT * FROM performance.fn_set_feature_attribute(
    'test_spectrum_amplitude', -- name of the feature
    'G97-2.07', -- name of the object model
    'feature_options_json', -- name of the attribute
    '{"attribute_value": {"frequency_name": "HSS-RF", "sensor_name": "4 - HSS - Radial", "acquisition_rate": "High", "spectrum_type": "Normal", "band_width": 0.1, "operation": "max"}}' -- value of the attribute
);
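The JSON value in the last query is easy to mistype by hand. As a minimal sketch, the string could be built and checked in Python before it is pasted into the query. The helper name `build_attribute_value` and the `REQUIRED_KEYS` set are hypothetical and not part of echo_energycalc; the keys and allowed values are the ones listed above.

```python
import json

# Keys documented for `feature_options_json` (taken from the list above)
REQUIRED_KEYS = {
    "frequency_name",
    "sensor_name",
    "acquisition_rate",
    "spectrum_type",
    "band_width",
    "operation",
}


def build_attribute_value(options: dict) -> str:
    """Hypothetical helper: check the documented keys and return the JSON string."""
    missing = REQUIRED_KEYS - options.keys()
    if missing:
        raise ValueError(f"Missing keys in feature options: {sorted(missing)}")
    if options["spectrum_type"] not in ("Normal", "Envelope"):
        raise ValueError("spectrum_type must be 'Normal' or 'Envelope'")
    # wrap the options in the "attribute_value" envelope used by the queries above
    return json.dumps({"attribute_value": options})


options = {
    "frequency_name": "HSS-RF",
    "sensor_name": "4 - HSS - Radial",
    "acquisition_rate": "High",
    "spectrum_type": "Normal",
    "band_width": 0.1,
    "operation": "max",
}
payload = build_attribute_value(options)
print(payload)
```

The printed string can then be used as the last argument of performance.fn_set_feature_attribute.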

Class Definition

FeatureCalcSpectrumAmplitude(object_name, feature)

FeatureCalculator class for features that are based on the amplitude of the vibration spectrum of a wind turbine.

The method will calculate the amplitude of a certain frequency band in the vibration spectrum of a wind turbine.

For this to work the feature must have attribute feature_options_json with the following keys:

  • frequency_name: Name of the subcomponent attribute that contains the value of the desired frequency, not including the "FREQ-" prefix. This must be in orders and will be considered as the center frequency of the band.
  • sensor_name: Name of the sensor that will be used to get the spectrum. Names must be the ones available for perfdb.vibration.spectrum.get method.
  • acquisition_rate: Acquisition rate of the sensor. Only applicable for Gamesa turbines. For other manufacturers set this to none.
  • spectrum_type: Type of the spectrum that will be used. Options are: 'Normal' or 'Envelope'.
  • band_width: Width of the band used to calculate the amplitude. This allows values to be taken not only at the center frequency but also at the surrounding frequencies. The band is centered on the frequency defined in frequency_name and extends band_width/2 on each side.
  • operation: Operation used to aggregate all amplitude values in the band. Any aggregation method available on pandas.Series can be used, but in most cases max or mean is appropriate.
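To make the roles of band_width and operation concrete, here is a minimal standalone sketch of the band selection and aggregation for a single spectrum. It mirrors the logic of the calculate method shown further down, but the function name `band_amplitude` and the sample spectrum are illustrative assumptions, not part of the library.

```python
import numpy as np
import pandas as pd


def band_amplitude(spectrum: np.ndarray, center: float, band_width: float, operation: str = "max") -> float:
    """Aggregate the amplitudes of a (2, N) spectrum inside a band around `center`.

    Row 0 of `spectrum` holds the frequencies (in orders), row 1 the amplitudes.
    """
    band_left = center - band_width / 2
    band_right = center + band_width / 2
    in_band = (spectrum[0] >= band_left) & (spectrum[0] <= band_right)
    if not in_band.any():
        # no spectrum lines fall inside the band
        return float("nan")
    # look up the aggregation by name on a pandas Series, e.g. "max" or "mean"
    return float(getattr(pd.Series(spectrum[1][in_band]), operation)())


# made-up spectrum: frequencies in the first row, amplitudes in the second
spectrum = np.array(
    [
        [0.90, 0.96, 1.00, 1.04, 1.10],
        [0.10, 0.30, 0.80, 0.50, 0.20],
    ]
)

peak = band_amplitude(spectrum, center=1.0, band_width=0.1, operation="max")
average = band_amplitude(spectrum, center=1.0, band_width=0.1, operation="mean")
print(peak, average)
```

With a center of 1.0 and a band_width of 0.1, only the lines at 0.96, 1.00 and 1.04 orders fall inside the band, so max returns the largest of those three amplitudes and mean returns their average.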

Parameters:

  • object_name

    (str) –

    Name of the object for which the feature is calculated. It must exist in performance_db.

  • feature

    (str) –

    Feature of the object that is calculated. It must exist in performance_db.

Source code in echo_energycalc/feature_calc_spectrum_amplitude.py
def __init__(
    self,
    object_name: str,
    feature: str,
) -> None:
    """
    FeatureCalculator class for features that are based on the amplitude of the vibration spectrum of a wind turbine.

    The method will calculate the amplitude of a certain frequency band in the vibration spectrum of a wind turbine.

    For this to work the feature must have attribute `feature_options_json` with the following keys:

    - `frequency_name`: Name of the subcomponent attribute that contains the value of the desired frequency, not including the "FREQ-" prefix. This must be in orders and will be considered as the center frequency of the band.
    - `sensor_name`: Name of the sensor that will be used to get the spectrum. Names must be the ones available for `perfdb.vibration.spectrum.get` method.
    - `acquisition_rate`: Acquisition rate of the sensor. Only applicable for Gamesa turbines. For other manufacturers set this to none.
    - `spectrum_type`: Type of the spectrum that will be used. Options are: 'Normal' or 'Envelope'.
    - `band_width`: Width of the band used to calculate the amplitude. This allows values to be taken not only at the center frequency but also at the surrounding frequencies. The band is centered on the frequency defined in `frequency_name` and extends `band_width`/2 on each side.
    - `operation`: Operation used to aggregate all amplitude values in the band. Any aggregation method available on pandas.Series can be used, but in most cases `max` or `mean` is appropriate.

    Parameters
    ----------
    object_name : str
        Name of the object for which the feature is calculated. It must exist in performance_db.
    feature : str
        Feature of the object that is calculated. It must exist in performance_db.
    """
    # initialize parent class
    super().__init__(object_name, feature)

    # requirements for the feature calculator
    self._add_requirement(RequiredFeatureAttributes(self.object, self.feature, ["feature_options_json"]))
    self._get_required_data()

    # validating feature options
    self._validate_feature_options()

feature property

Feature that is calculated. This will be defined in the constructor and cannot be changed.

Returns:

  • str

    Name of the feature that is calculated.

name property

Name of the feature calculator. It is defined in child classes of FeatureCalculator.

This must be equal to the "server_calc_type" attribute of the feature in performance_db.

Returns:

  • str

    Name of the feature calculator.

object property

Object for which the feature is calculated. This will be defined in the constructor and cannot be changed.

Returns:

  • str

    Object name for which the feature is calculated.

requirements property

List of requirements of the feature calculator. It is defined in child classes of FeatureCalculator.

Returns:

  • dict[str, list[CalculationRequirement]]

    Dict of requirements.

    The keys are the names of the classes of the requirements and the values are lists of requirements of that class.

    For example: {"RequiredFeatures": [RequiredFeatures(...), RequiredFeatures(...)], "RequiredObjects": [RequiredObjects(...)]}

result property

Result of the calculation. This is None until the method "calculate" is called.

Returns:

  • Series | DataFrame | None

    Result of the calculation if the method "calculate" was called. None otherwise.

calculate(period, save_into=None, cached_data=None, **kwargs)

Method that will calculate the amplitude of a certain frequency band in the vibration spectrum of a wind turbine.

Parameters:

  • period

    (DateTimeRange) –

    Period for which the feature will be calculated.

  • save_into

    (Literal['all', 'performance_db'] | None, default: None ) –

    Argument that will be passed to the method "save". The options are:

      • "all": The feature will be saved in performance_db and bazefield.
      • "performance_db": The feature will be saved only in performance_db.
      • None: The feature will not be saved.

    By default None.

  • cached_data

    (DataFrame | None, default: None ) –

    DataFrame with features already queried/calculated. This is useful to avoid needing to query all the data again from performance_db, making chained calculations a lot more efficient. By default None.

  • **kwargs

    (dict, default: {} ) –

    Additional arguments that will be passed to the "_save" method.

Returns:

  • Series

    Pandas Series with the calculated feature.

Source code in echo_energycalc/feature_calc_spectrum_amplitude.py
def calculate(
    self,
    period: DateTimeRange,
    save_into: Literal["all", "performance_db"] | None = None,
    cached_data: DataFrame | None = None,
    **kwargs,
) -> Series:
    """
    Method that will calculate the amplitude of a certain frequency band in the vibration spectrum of a wind turbine.

    Parameters
    ----------
    period : DateTimeRange
        Period for which the feature will be calculated.
    save_into : Literal["all", "performance_db"] | None, optional
        Argument that will be passed to the method "save". The options are:
        - "all": The feature will be saved in performance_db and bazefield.
        - "performance_db": the feature will be saved only in performance_db.
        - None: The feature will not be saved.

        By default None.
    cached_data : DataFrame | None, optional
        DataFrame with features already queried/calculated. This is useful to avoid needing to query all the data again from performance_db, making chained calculations a lot more efficient.
        By default None
    **kwargs : dict, optional
        Additional arguments that will be passed to the "_save" method.

    Returns
    -------
    Series
        Pandas Series with the calculated feature.
    """
    # getting required vibration data
    self._get_required_data(period=period, cached_data=cached_data, only_missing=True)

    # getting vibration data
    vibration_df: DataFrame = self._get_requirement_data("RequiredVibrationData").copy()

    # creating series for the result
    result = self._create_empty_result(period=period, result_type="Series")

    # filtering vibration data
    vibration_df = vibration_df[
        (vibration_df["sensor"] == self._feature_options["sensor_name"])
        & (vibration_df["timestamp"].between(period.start, period.end))
        & (vibration_df["object_name"] == self.object)
        & (vibration_df["spectrum_type"] == self._feature_options["spectrum_type"])
    ].copy()

    # acquisition frequency
    if self._feature_options["acquisition_rate"] is not None:
        vibration_df = vibration_df[vibration_df["acquisition_frequency"] == self._feature_options["acquisition_rate"]].copy()

    # if vibration_df is empty, return empty result
    if vibration_df.empty:
        logger.debug(f"No vibration data found for object {self.object} in period {period}.")

        result = result.dropna()

        return result

    # reindexing vibration_df with the index in result to match 10 min frequency
    vibration_df["timestamp"] = vibration_df["timestamp"].astype("datetime64[s]")
    vibration_df = vibration_df.set_index("timestamp")
    vibration_df = vibration_df.reindex(result.index, method="nearest", tolerance=timedelta(minutes=4, seconds=59))
    vibration_df.index.name = "timestamp"
    vibration_df = vibration_df.reset_index()
    # dropping all NA values
    vibration_df = vibration_df.dropna(subset=["value"])

    # getting frequency center
    frequency_center = self._freq_data.copy()
    # reindexing frequency_center to the index in result using forward fill method
    frequency_center = frequency_center.reindex(result.index, method="ffill")
    frequency_center.index.name = "timestamp"
    # renaming the series
    frequency_center.name = "frequency_center"

    # making both time columns have the same data type datetime64[s]
    frequency_center = frequency_center.to_frame().reset_index()
    frequency_center["timestamp"] = frequency_center["timestamp"].astype("datetime64[s]")

    # merging frequency_center with vibration_df
    vibration_df = vibration_df.merge(frequency_center, on="timestamp", how="left")

    # calculating band limits
    vibration_df["band_left"] = vibration_df["frequency_center"] - self._feature_options["band_width"] / 2
    vibration_df["band_right"] = vibration_df["frequency_center"] + self._feature_options["band_width"] / 2

    vibration_df = vibration_df.astype({"band_left": "float64", "band_right": "float64"})

    # converting value to numpy arrays for faster iteration
    values = vibration_df["value"].values
    band_lefts = vibration_df["band_left"].values
    band_rights = vibration_df["band_right"].values

    # creating empty list to store the results for each timestamp
    # this will later be inserted into the result series
    temp_result = []

    # getting the operation that will be used to aggregate the values in the band
    operation_name = self._feature_options["operation"]

    # iterating over the vibration data
    for i in range(len(values)):
        # getting the spectrum
        spectrum: ndarray = values[i]

        # getting the band limits
        band_left: float = band_lefts[i]
        band_right: float = band_rights[i]

        # finding indexes of the value that are in the band
        # consider that value is a 2D numpy array (2, N), where dimension 1 has the x axis values (frequency) and dimension 2 has the y axis values (amplitude)
        indexes = (spectrum[0] >= band_left) & (spectrum[0] <= band_right)

        # if there are no values in the band, append NA to the result
        if not indexes.any():
            temp_result.append(NA)
            continue

        # getting the values in the band
        values_in_band = spectrum[1][indexes]

        # aggregating the values in the band
        temp_result.append(getattr(Series(values_in_band), operation_name)())

    # creating a temporary series with the results
    result = Series(temp_result, index=vibration_df["timestamp"], name=result.name, dtype="float64")

    # dropping NA values
    result = result.dropna()

    # adding calculated feature to class result attribute
    self._result = result.copy()

    # saving results
    self.save(save_into=save_into, **kwargs)

    return result
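The step that aligns irregular acquisition timestamps to the 10-minute result index can be tried in isolation. The sketch below uses made-up timestamps and values; only the `reindex` call with `method="nearest"` and the tolerance of 4 minutes 59 seconds is taken from the method above.

```python
from datetime import timedelta

import pandas as pd

# regular 10-minute grid, standing in for the index of the empty result
grid = pd.to_datetime(
    [
        "2024-01-01 00:00:00",
        "2024-01-01 00:10:00",
        "2024-01-01 00:20:00",
        "2024-01-01 00:30:00",
    ]
).rename("timestamp")

# irregular acquisition times with placeholder values (hypothetical data)
raw = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0]},
    index=pd.to_datetime(
        [
            "2024-01-01 00:01:30",
            "2024-01-01 00:12:00",
            "2024-01-01 00:36:00",
        ]
    ),
)

# each grid timestamp takes the nearest acquisition, but only if it lies within the tolerance
aligned = raw.reindex(grid, method="nearest", tolerance=timedelta(minutes=4, seconds=59))

# grid timestamps without a close enough acquisition are dropped
aligned = aligned.dropna(subset=["value"])
print(aligned)
```

In this example the 00:00 and 00:10 slots pick up the acquisitions at 00:01:30 and 00:12:00, while 00:20 and 00:30 have no acquisition within the tolerance and are dropped.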

save(save_into=None, **kwargs)

Method to save the calculated feature values in performance_db.

Parameters:

  • save_into

    (Literal['all', 'performance_db'] | None, default: None ) –

    Argument that will be passed to the method "save". The options are:

      • "all": The feature will be saved in performance_db and bazefield.
      • "performance_db": The feature will be saved only in performance_db.
      • None: The feature will not be saved.

    By default None.

  • **kwargs

    (dict, default: {} ) –

    Not being used at the moment. Here only for compatibility.

Source code in echo_energycalc/feature_calc_core.py
def save(
    self,
    save_into: Literal["all", "performance_db"] | None = None,
    **kwargs,  # noqa: ARG002
) -> None:
    """
    Method to save the calculated feature values in performance_db.

    Parameters
    ----------
    save_into : Literal["all", "performance_db"] | None, optional
        Argument that will be passed to the method "save". The options are:
        - "all": The feature will be saved in performance_db and bazefield.
        - "performance_db": the feature will be saved only in performance_db.
        - None: The feature will not be saved.

        By default None.
    **kwargs : dict, optional
        Not being used at the moment. Here only for compatibility.
    """
    # checking arguments
    if not isinstance(save_into, str | type(None)):
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}")
    if isinstance(save_into, str) and save_into not in ["all", "performance_db"]:
        raise ValueError(f"save_into must be 'all', 'performance_db' or None, not {save_into}")

    # checking if calculation was done
    if self.result is None:
        raise ValueError(
            "The calculation was not done. Cannot save the feature calculation results. Please make sure to do something like 'self._result = df[self.feature].copy()' in the method 'calculate' before calling 'self.save()'.",
        )

    if save_into is None:
        return

    if isinstance(save_into, str):
        if save_into not in ["performance_db", "all"]:
            raise ValueError(f"save_into must be 'performance_db' or 'all', not {save_into}.")
        upload_to_bazefield = save_into == "all"
    elif save_into is None:
        upload_to_bazefield = False
    else:
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}.")

    # converting result series to DataFrame if needed
    if isinstance(self.result, Series):
        result_df = self.result.to_frame()
    elif isinstance(self.result, DataFrame):
        result_df = self.result.droplevel(0, axis=1)
    else:
        raise TypeError(f"result must be a pandas Series or DataFrame, not {type(self.result)}.")

    # adjusting DataFrame to be inserted in the database
    # making the columns a MultiIndex with levels object_name and feature_name
    result_df.columns = MultiIndex.from_product([[self.object], result_df.columns], names=["object_name", "feature_name"])

    self._perfdb.features.values.series.insert(
        df=result_df,
        on_conflict="update",
        bazefield_upload=upload_to_bazefield,
    )
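The column reshaping done just before the insert can be reproduced with plain pandas. This sketch assumes a Series result; the object name "WTG-01" and the values are placeholders, and no database call is made.

```python
import pandas as pd

# placeholder result, shaped like the Series returned by calculate
result = pd.Series(
    [0.8, 0.7],
    index=pd.to_datetime(["2024-01-01 00:00:00", "2024-01-01 00:10:00"]).rename("timestamp"),
    name="test_spectrum_amplitude",
)

# a Series becomes a single-column DataFrame named after the feature
result_df = result.to_frame()

# columns become a two-level MultiIndex: (object_name, feature_name)
result_df.columns = pd.MultiIndex.from_product(
    [["WTG-01"], result_df.columns],  # "WTG-01" is a hypothetical object name
    names=["object_name", "feature_name"],
)
print(result_df)
```

The resulting frame has one column keyed by the pair (object name, feature name), which is the layout the save method builds before passing the data on for insertion.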