Time Series Amplitude¶
Overview¶
The FeatureCalcTimeseriesAmplitude class is a subclass of FeatureCalculator that calculates the amplitude of the time series of a vibration signal. This makes it possible to create features that track the amplitude of a signal over time, so that tendencies in vibration and blade gap data can be checked quickly.
Calculation Logic¶
The calculation logic is described in the constructor of the class, shown below in the Class Definition section.
Database Requirements¶
- Feature attribute `server_calc_type` must be set to `timeseries_amplitude`.
- Feature attribute `feature_options_json` with the following keys:
  - `sensor_name`: Name of the sensor that will be used to get the timeseries. Names must be the ones available for the `perfdb.vibration.timeseries.get` method.
  - `data_type`: Type of the data that will be used to get the timeseries. All values allowed in the `perfdb.vibration.timeseries.get` method are valid, for example "Vibration".
  - `acquisition_rate`: Acquisition rate of the sensor. Only applicable for Gamesa turbines and vibration sensors. For other manufacturers set this to null.
  - `variable_name`: Name of the variable that will be used to get the timeseries. Only applicable for Gamesa turbines and blade gap sensors (deprecated). For other manufacturers set this to null.
  - `operation`: Which type of amplitude calculation will be used. Valid values are "peak", "peak-to-peak", "rms", "mean", "median" and "std".
- Data in the `raw_data_values` table that corresponds to the desired object, sensor, acquisition rate, variable name and time range.
Below is an example of the queries needed to create a feature that uses the FeatureCalcTimeseriesAmplitude class:
-- Create the feature
SELECT * FROM performance.fn_create_or_update_feature(
'G97-2.07', -- name of the object model
'server_calc', -- name of the data source type (server_calc)
'test_timeseries_amplitude', -- name of the feature
'Test for FeatureCalcTimeseriesAmplitude', -- description of the feature
NULL, -- name of the feature in the data source (not applicable for calculations)
NULL, -- id of the feature in the data source (not applicable for calculations)
NULL, -- unit of the feature
NULL -- leave as NULL, deprecated
);
-- Set the feature attribute `server_calc_type`
SELECT * FROM performance.fn_set_feature_attribute(
'test_timeseries_amplitude', -- name of the feature
'G97-2.07', -- name of the object model
'server_calc_type', -- name of the attribute
'{"attribute_value": "timeseries_amplitude"}' -- value of the attribute
);
-- Set the feature attribute `feature_options_json`
SELECT * FROM performance.fn_set_feature_attribute(
'test_timeseries_amplitude', -- name of the feature
'G97-2.07', -- name of the object model
'feature_options_json', -- name of the attribute
'{"attribute_value": {"sensor_name": "Blade C", "data_type": "Blade Gap", "acquisition_rate": null, "variable_name": "Position - Y", "operation": "peak-to-peak"}}' -- value of the attribute
);
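With the feature and its attributes in place, the calculator can be used from Python. The snippet below is a minimal usage sketch: the module path comes from the source listing further down, while the DateTimeRange import and the turbine name are assumptions to adapt to your environment.

```python
from datetimerange import DateTimeRange  # assumed provider of DateTimeRange

from echo_energycalc.feature_calc_timeseries_amplitude import FeatureCalcTimeseriesAmplitude

# Object and feature must already exist in performance_db (see the SQL above);
# "G97-A01" is a hypothetical turbine name for a G97-2.07 object model.
calc = FeatureCalcTimeseriesAmplitude(
    object_name="G97-A01",
    feature="test_timeseries_amplitude",
)

# Compute the peak-to-peak blade gap amplitude for one day without saving it.
period = DateTimeRange("2024-01-01T00:00:00", "2024-01-02T00:00:00")
result = calc.calculate(period=period, save_into=None)
print(result.head())
```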
Class Definition¶
FeatureCalcTimeseriesAmplitude(object_name, feature)¶
FeatureCalculator class for features that are based on the amplitude of the vibration time series of a wind turbine.
The calculation is fairly simple:
- Get the vibration timeseries for the specified sensor, data type, acquisition rate, and variable name.
- Apply the desired operation (peak, peak-to-peak, rms, mean, median, std) to the timeseries data. This will reduce the timeseries to a single value for each timestamp.
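As a reference for what each operation computes, here is a minimal numpy sketch of the supported aggregations (not the library code itself, just the equivalent formulas applied to one acquired timeseries):

```python
import numpy as np

timeseries = np.array([0.2, -0.5, 0.7, -0.1, 0.4])  # one acquired waveform (example values)

amplitude = {
    "peak": timeseries.max(),                             # maximum value
    "peak-to-peak": timeseries.max() - timeseries.min(),  # max minus min
    "rms": np.sqrt(np.mean(np.square(timeseries))),       # root mean square
    "mean": np.mean(timeseries),
    "median": np.median(timeseries),
    "std": np.std(timeseries),
}
```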
For this to work, the feature must have the attribute `feature_options_json` with the following keys:
- `sensor_name`: Name of the sensor that will be used to get the timeseries. Names must be the ones available for the `perfdb.vibration.timeseries.get` method.
- `data_type`: Type of the data that will be used to get the timeseries. All values allowed in the `perfdb.vibration.timeseries.get` method are valid, for example "Vibration".
- `acquisition_rate`: Acquisition rate of the sensor. Only applicable for Gamesa turbines and vibration sensors. For other manufacturers set this to none.
- `variable_name`: Name of the variable that will be used to get the timeseries. Only applicable for Gamesa turbines and blade gap sensors (deprecated). For other manufacturers set this to none.
- `operation`: Which type of amplitude calculation will be used. Valid values are "peak", "peak-to-peak", "rms", "mean", "median", "std".
Parameters:
- object_name (str) – Name of the object for which the feature is calculated. It must exist in performance_db.
- feature (str) – Feature of the object that is calculated. It must exist in performance_db.
Source code in echo_energycalc/feature_calc_timeseries_amplitude.py
def __init__(
self,
object_name: str,
feature: str,
) -> None:
"""
FeatureCalculator class for features that are based on the amplitude of the vibration time series of a wind turbine.
The calculation is fairly simple:
1. Get the vibration timeseries for the specified sensor, data type, acquisition rate, and variable name.
2. Apply the desired operation (peak, peak-to-peak, rms, mean, median, std) to the timeseries data. This will reduce the timeseries to a single value for each timestamp.
For this to work the feature must have attribute `feature_options_json` with the following keys:
- `sensor_name`: Name of the sensor that will be used to get the timeseries. Names must be the ones available for `perfdb.vibration.timeseries.get` method.
- `data_type`: Type of the data that will be used to get the timeseries. All values allowed in `perfdb.vibration.timeseries.get` method are valid. For example, "Vibration", etc.
- `acquisition_rate`: Acquisition rate of the sensor. Only applicable for Gamesa turbines and vibration sensors. For other manufacturers set this to none.
- `variable_name`: Name of the variable that will be used to get the timeseries. This is only applicable for Gamesa turbines and blade gap sensors (deprecated). For other manufacturers set this to none.
- `operation`: Which type of amplitude calculation will be used. Valid values are "peak", "peak-to-peak", "rms", "mean", "median", "std".
Parameters
----------
object_name : str
Name of the object for which the feature is calculated. It must exist in performance_db.
feature : str
Feature of the object that is calculated. It must exist in performance_db.
"""
# initialize parent class
super().__init__(object_name, feature)
# requirements for the feature calculator
self._add_requirement(RequiredFeatureAttributes(self.object, self.feature, ["feature_options_json"]))
self._get_required_data()
# validating feature options
self._validate_feature_options()
feature property¶
Feature that is calculated. This will be defined in the constructor and cannot be changed.
Returns:
- str – Name of the feature that is calculated.
name property¶
Name of the feature calculator. Is defined in child classes of FeatureCalculator.
This must be equal to the "server_calc_type" attribute of the feature in performance_db.
Returns:
- str – Name of the feature calculator.
object property¶
Object for which the feature is calculated. This will be defined in the constructor and cannot be changed.
Returns:
- str – Object name for which the feature is calculated.
requirements property¶
List of requirements of the feature calculator. Is defined in child classes of FeatureCalculator.
Returns:
- dict[str, list[CalculationRequirement]] – Dict of requirements. The keys are the names of the classes of the requirements and the values are lists of requirements of that class. For example:
{"RequiredFeatures": [RequiredFeatures(...), RequiredFeatures(...)], "RequiredObjects": [RequiredObjects(...)]}
result property¶
Result of the calculation. This is None until the method "calculate" is called.
Returns:
- Series | DataFrame | None – Result of the calculation if the method "calculate" was called. None otherwise.
calculate(period, save_into=None, cached_data=None, **kwargs)¶
Method that will calculate the amplitude of the vibration time series of a wind turbine.
Parameters:
- period (DateTimeRange) – Period for which the feature will be calculated.
- save_into (Literal['all', 'performance_db'] | None, default: None) – Argument that will be passed to the method "save". The options are:
  - "all": The feature will be saved in performance_db and bazefield.
  - "performance_db": The feature will be saved only in performance_db.
  - None: The feature will not be saved.
- cached_data (DataFrame | None, default: None) – DataFrame with features already queried/calculated. This is useful to avoid needing to query all the data again from performance_db, making chained calculations a lot more efficient.
- **kwargs (dict, default: {}) – Additional arguments that will be passed to the "_save" method.
Returns:
- Series – Pandas Series with the calculated feature.
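The sketch below illustrates the save_into and cached_data arguments. It reuses calc from the instantiation example earlier; cached_features_df stands for a hypothetical DataFrame of features already queried in a previous step of a chained calculation, and DateTimeRange is assumed to come from the datetimerange package.

```python
from datetimerange import DateTimeRange  # assumed provider of DateTimeRange
from pandas import DataFrame

week = DateTimeRange("2024-01-01T00:00:00", "2024-01-08T00:00:00")

# Save the result in performance_db only (use "all" to also upload to bazefield).
series = calc.calculate(period=week, save_into="performance_db")

# In a chained calculation, passing already queried features avoids re-querying performance_db.
cached_features_df: DataFrame | None = None  # would hold previously queried/calculated features
series = calc.calculate(period=week, save_into=None, cached_data=cached_features_df)
```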
Source code in echo_energycalc/feature_calc_timeseries_amplitude.py
def calculate(
self,
period: DateTimeRange,
save_into: Literal["all", "performance_db"] | None = None,
cached_data: DataFrame | None = None,
**kwargs,
) -> Series:
"""
Method that will calculate the amplitude of the vibration time series of a wind turbine.
Parameters
----------
period : DateTimeRange
Period for which the feature will be calculated.
save_into : Literal["all", "performance_db"] | None, optional
Argument that will be passed to the method "save". The options are:
- "all": The feature will be saved in performance_db and bazefield.
- "performance_db": the feature will be saved only in performance_db.
- None: The feature will not be saved.
By default None.
cached_data : DataFrame | None, optional
DataFrame with features already queried/calculated. This is useful to avoid needing to query all the data again from performance_db, making chained calculations a lot more efficient.
By default None
**kwargs : dict, optional
Additional arguments that will be passed to the "_save" method.
Returns
-------
Series
Pandas Series with the calculated feature.
"""
# getting required vibration data
self._get_required_data(period=period, cached_data=cached_data, only_missing=True)
# getting vibration data
vibration_df: DataFrame = self._get_requirement_data("RequiredVibrationData").copy()
# creating series for the result
result = self._create_empty_result(period=period, result_type="Series")
# filtering vibration data
vibration_df = vibration_df[
(vibration_df["sensor"] == self._feature_options["sensor_name"])
& (vibration_df["timestamp"].between(period.start, period.end))
& (vibration_df["object_name"] == self.object)
].copy()
# acquisition frequency
if self._feature_options["acquisition_rate"] is not None:
vibration_df = vibration_df[vibration_df["acquisition_frequency"] == self._feature_options["acquisition_rate"]].copy()
# variable name
if self._feature_options["variable_name"] is not None:
vibration_df = vibration_df[vibration_df["variable_name"] == self._feature_options["variable_name"]].copy()
# if vibration_df is empty, return empty result
if vibration_df.empty:
logger.debug(f"No vibration data found for object {self.object} in period {period}.")
result = result.dropna()
return result
# reindexing vibration_df with the index in result to match 10 min frequency
vibration_df["timestamp"] = vibration_df["timestamp"].astype("datetime64[s]")
vibration_df = vibration_df.set_index("timestamp")
vibration_df = vibration_df.reindex(result.index, method="nearest", tolerance=timedelta(minutes=4, seconds=59))
vibration_df.index.name = "timestamp"
vibration_df = vibration_df.reset_index()
# dropping all NA values
vibration_df = vibration_df.dropna(subset=["value"])
# converting value to numpy arrays for faster iteration
values = vibration_df["value"].values
# creating empty list to store the results for each timestamp
# this will later be inserted into the result series
temp_result = []
# getting the operation that will be used to aggregate the timeseries values
operation_name = self._feature_options["operation"]
# iterating over the vibration data
for i in range(len(values)):
# getting the timeseries
# we use [1] as dimension 0 is the time and 1 is the timeseries values
timeseries: ndarray = values[i][1]
# making the operation
match operation_name:
case "peak":
# getting the peak value
op_result = timeseries.max()
case "peak-to-peak":
# getting the peak-to-peak value
op_result = timeseries.max() - timeseries.min()
case "rms":
# getting the root mean square value
op_result = np.sqrt(np.mean(np.square(timeseries)))
case "mean":
# getting the mean value
op_result = np.mean(timeseries)
case "median":
# getting the median value
op_result = np.median(timeseries)
case "std":
# getting the standard deviation value
op_result = np.std(timeseries)
case _:
raise ValueError(f"Operation '{operation_name}' is not supported.")
temp_result.append(op_result)
# creating a temporary series with the results
result = Series(temp_result, index=vibration_df["timestamp"], name=result.name, dtype="float64")
# dropping NA values
result = result.dropna()
# adding calculated feature to class result attribute
self._result = result.copy()
# saving results
self.save(save_into=save_into, **kwargs)
return result
save(save_into=None, **kwargs)¶
Method to save the calculated feature values in performance_db.
Parameters:
- save_into (Literal['all', 'performance_db'] | None, default: None) – Argument that will be passed to the method "save". The options are:
  - "all": The feature will be saved in performance_db and bazefield.
  - "performance_db": The feature will be saved only in performance_db.
  - None: The feature will not be saved.
- **kwargs (dict, default: {}) – Not being used at the moment. Here only for compatibility.
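Since calculate already forwards its save_into argument to this method, calling save directly is mainly useful to persist a result that was computed with save_into=None. A hedged sketch, reusing calc and period from the examples above:

```python
# Compute without persisting, inspect the values, then save explicitly.
result = calc.calculate(period=period, save_into=None)

if not result.empty:
    calc.save(save_into="performance_db")  # or "all" to also upload to bazefield
```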
Source code in echo_energycalc/feature_calc_core.py
def save(
self,
save_into: Literal["all", "performance_db"] | None = None,
**kwargs, # noqa: ARG002
) -> None:
"""
Method to save the calculated feature values in performance_db.
Parameters
----------
save_into : Literal["all", "performance_db"] | None, optional
Argument that will be passed to the method "save". The options are:
- "all": The feature will be saved in performance_db and bazefield.
- "performance_db": the feature will be saved only in performance_db.
- None: The feature will not be saved.
By default None.
**kwargs : dict, optional
Not being used at the moment. Here only for compatibility.
"""
# checking arguments
if not isinstance(save_into, str | type(None)):
raise TypeError(f"save_into must be a string or None, not {type(save_into)}")
if isinstance(save_into, str) and save_into not in ["all", "performance_db"]:
raise ValueError(f"save_into must be 'all', 'performance_db' or None, not {save_into}")
# checking if calculation was done
if self.result is None:
raise ValueError(
"The calculation was not done. Cannot save the feature calculation results. Please make sure to do something like 'self._result = df[self.feature].copy()' in the method 'calculate' before calling 'self.save()'.",
)
if save_into is None:
return
if isinstance(save_into, str):
if save_into not in ["performance_db", "all"]:
raise ValueError(f"save_into must be 'performance_db' or 'all', not {save_into}.")
upload_to_bazefield = save_into == "all"
elif save_into is None:
upload_to_bazefield = False
else:
raise TypeError(f"save_into must be a string or None, not {type(save_into)}.")
# converting result series to DataFrame if needed
if isinstance(self.result, Series):
result_df = self.result.to_frame()
elif isinstance(self.result, DataFrame):
result_df = self.result.droplevel(0, axis=1)
else:
raise TypeError(f"result must be a pandas Series or DataFrame, not {type(self.result)}.")
# adjusting DataFrame to be inserted in the database
# making the columns a Multindex with levels object_name and feature_name
result_df.columns = MultiIndex.from_product([[self.object], result_df.columns], names=["object_name", "feature_name"])
self._perfdb.features.values.series.insert(
df=result_df,
on_conflict="update",
bazefield_upload=upload_to_bazefield,
)
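For reference, the frame handed to perfdb.features.values.series.insert carries an (object_name, feature_name) MultiIndex over its columns and a timestamp index. The pandas sketch below only illustrates that layout; the object name and values are made up.

```python
import pandas as pd

# Example 10-minute result series, as produced by calculate (hypothetical values).
idx = pd.date_range("2024-01-01", periods=3, freq="10min", name="timestamp")
values = pd.Series([0.12, 0.15, 0.11], index=idx, name="test_timeseries_amplitude")

# Columns become a MultiIndex with levels object_name and feature_name.
result_df = values.to_frame()
result_df.columns = pd.MultiIndex.from_product(
    [["G97-A01"], result_df.columns],  # hypothetical object name
    names=["object_name", "feature_name"],
)
print(result_df)
# This shape is what perfdb.features.values.series.insert(df=result_df, on_conflict="update") expects.
```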