Solar Soiling Loss
Overview
The SolarEnergyLossSoiling class is a subclass of SolarEnergyLossCalculator and FeatureCalculator. It calculates the energy lost to soiling on solar panels.
Calculation Logic
The calculation works as follows:
- Soiling Ratio Calculation:
  - For BRR objects: uses dual irradiance sensors (CellIrradiance1 / CellIrradiance2).
  - For RBG objects: uses short-circuit current measurements (Icc / SolarIrradiance), normalized by 99th-percentile performance.
- Event Detection: applies a centered rolling median (7-day window) and flags cleaning/soiling events when changes exceed the Q3 + 1.5 * IQR threshold.
- Linear Interpolation: reconstructs the soiling curve by interpolating between detected events, which removes the effect of cleaning activities.
- Monthly Weighted Averaging: calculates monthly soiling ratios weighted by daily irradiance: Σ(soiling_ratio × irradiance) / Σ(irradiance). This gives more weight to high-irradiance days and less to low-irradiance days, where the difference between soiled and clean modules may not be significant.
- Loss Application: applies monthly ratios to 10-minute theoretical power data: (1 - monthly_ratio) × theoretical_power
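For BRR objects, the calculation extends the period back to January 1st of the year to better capture seasonal trends and the complete data distribution for an accurate soiling ratio. For RBG objects, the source code starts the soiling period on 2024-04-01, when RBG soiling data begins.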
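The event-detection step can be illustrated with a minimal sketch. It is not the class's actual implementation: the function name `detect_events`, the use of the absolute day-to-day change of the rolling median, and the synthetic data are all assumptions made for illustration.

```python
import pandas as pd

WINDOW_DAYS = 7       # centered rolling-median window (days)
IQR_MULTIPLIER = 1.5  # multiplier in the Q3 + 1.5 * IQR threshold


def detect_events(daily_ratio: pd.Series) -> pd.Series:
    """Flag days whose smoothed day-to-day change is a statistical outlier."""
    # Smooth the daily ratio with a centered rolling median
    smoothed = daily_ratio.rolling(WINDOW_DAYS, center=True, min_periods=1).median()
    # Assumption: events are judged on the absolute change of the smoothed curve
    change = smoothed.diff().abs()
    q1, q3 = change.quantile(0.25), change.quantile(0.75)
    threshold = q3 + IQR_MULTIPLIER * (q3 - q1)
    return change > threshold


# Synthetic data: steady soiling with a single cleaning on day 30 (2024-01-31)
days = pd.date_range("2024-01-01", periods=60, freq="D")
step = 2.0**-10  # daily soiling step, chosen to be exact in floating point
values = [1.0 - (i % 30) * step for i in range(60)]
events = detect_events(pd.Series(values, index=days))
print(events[events].index.tolist())
```

In this synthetic series the only flagged day is the cleaning itself, because every other change of the smoothed curve is at or below the threshold.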
Database Requirements
- Feature attribute server_calc_type must be set to 'solar_energy_loss_soiling'.
- Object must have reference_soiling_stations and reference_weather_stations attributes configured.
Required Data
Object Attributes
- reference_soiling_stations: Soiling measurement station(s)
- reference_weather_stations: Associated weather station for irradiance data
Features
Soiling Station:
- CellIrradiance1_5min.AVG, CellIrradiance2_5min.AVG (BRR objects)
- SoilModuleIccCurrent, SoilModuleIrradiance, CleanModuleIrradiance (RBG objects)
Object:
- IrradiancePOAReference_5min.AVG: Plane-of-array irradiance
- ActivePowerTheoretical_10min.AVG: Theoretical power generation
Weather Station:
- SoilRateAdjusted_1d.AVG: Calculated adjusted soiling rate (intermediate)
Configuration Constants
SOILING_DETECTION_WINDOW = 7 # Days for rolling median
SOILING_THRESHOLD_MULTIPLIER = 1.5 # IQR multiplier for event detection
MIN_IRRADIANCE_THRESHOLD = 500 # W/m² minimum for valid data
NORMALIZATION_PERCENTILE = 0.99 # Percentile for performance normalization
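As a rough illustration of how these constants might enter the soiling ratio calculation, the sketch below filters out low-irradiance samples and normalizes by the 99th percentile. The function names, the choice of which sensor the irradiance threshold is applied to, and the daily-mean aggregation are assumptions, not the library's confirmed behavior.

```python
import pandas as pd

MIN_IRRADIANCE_THRESHOLD = 500   # W/m², minimum irradiance for valid data
NORMALIZATION_PERCENTILE = 0.99  # percentile used for normalization


def brr_soiling_ratio(df: pd.DataFrame) -> pd.Series:
    """Daily mean of CellIrradiance1 / CellIrradiance2 on valid samples."""
    # Assumption: the threshold is applied to the second sensor
    valid = df[df["CellIrradiance2_5min.AVG"] >= MIN_IRRADIANCE_THRESHOLD]
    ratio = valid["CellIrradiance1_5min.AVG"] / valid["CellIrradiance2_5min.AVG"]
    return ratio.resample("D").mean()


def rbg_soiling_ratio(icc: pd.Series, irradiance: pd.Series) -> pd.Series:
    """Icc / irradiance, normalized by its 99th percentile."""
    mask = irradiance >= MIN_IRRADIANCE_THRESHOLD
    performance = icc[mask] / irradiance[mask]
    return performance / performance.quantile(NORMALIZATION_PERCENTILE)


# Hypothetical sample data
index = pd.to_datetime(
    ["2024-01-01 11:00", "2024-01-01 11:05", "2024-01-01 12:00", "2024-01-02 12:00"]
)
brr_df = pd.DataFrame(
    {
        "CellIrradiance1_5min.AVG": [100.0, 760.0, 950.0, 900.0],
        "CellIrradiance2_5min.AVG": [400.0, 800.0, 1000.0, 1000.0],
    },
    index=index,
)
brr_daily = brr_soiling_ratio(brr_df)

icc = pd.Series([4.0, 8.0, 9.0], index=index[:3])
irradiance = pd.Series([400.0, 800.0, 1000.0], index=index[:3])
rbg_ratio = rbg_soiling_ratio(icc, irradiance)
```

The first BRR sample is dropped because its irradiance is below 500 W/m², so it does not distort the daily mean.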
Class Definition
SolarEnergyLossSoiling(object_name, feature)
Calculator for solar energy losses due to soiling effects.
This class calculates soiling losses based on:
1. Soiling ratio calculated from dual irradiance sensors (for BRR objects) or current measurements (for RBG objects)
2. Linear interpolation between cleaning/soiling events
3. Monthly weighted averages using irradiance as weights
4. Application of monthly ratios to 10-minute theoretical power = LostActivePowerSoiling
The calculation process:
- Detects soiling/cleaning events using statistical thresholds (1.5 * IQR of rolling median changes)
- Reconstructs the soiling curve via linear interpolation
- Calculates monthly weighted soiling ratios
- Applies monthly ratios to 10-minute theoretical power to get the losses
Parameters:
- object_name (str) – Name of the solar asset object in performance_db.
- feature (str) – Name of the soiling loss feature to calculate.
Raises:
- ValueError – If required object attributes are missing from the database.
Source code in echo_energycalc/solar_energy_loss_soiling.py
def __init__(self, object_name: str, feature: str) -> None:
    """
    Initialize the soiling loss calculator.

    Parameters
    ----------
    object_name : str
        Name of the solar asset object in performance_db.
    feature : str
        Name of the soiling loss feature to calculate.

    Raises
    ------
    ValueError
        If required object attributes are missing from the database.
    """
    super().__init__(object_name, feature)

    # Define which object attributes are required for the calculation and get them from the database
    self._add_requirement(
        RequiredObjectAttributes(
            {
                self.object: [
                    "reference_soiling_stations",
                    "reference_weather_stations",
                ],
            },
        ),
    )
    self._get_required_data()

    # Get soiling and weather station references
    self._soiling_station = self._get_requirement_data("RequiredObjectAttributes")[self.object]["reference_soiling_stations"][0]
    self._weather_station = self._get_requirement_data("RequiredObjectAttributes")[self.object]["reference_weather_stations"][
        "complete_ws"
    ]
feature property
Feature that is calculated. This will be defined in the constructor and cannot be changed.
Returns:
- str – Name of the feature that is calculated.
name property
Name of the feature calculator. Is defined in child classes of FeatureCalculator.
This must be equal to the "server_calc_type" attribute of the feature in performance_db.
Returns:
- str – Name of the feature calculator.
object property
Object for which the feature is calculated. This will be defined in the constructor and cannot be changed.
Returns:
- str – Object name for which the feature is calculated.
requirements property
List of requirements of the feature calculator. Is defined in child classes of FeatureCalculator.
Returns:
- dict[str, list[CalculationRequirement]] – Dict of requirements.
  The keys are the names of the classes of the requirements and the values are lists of requirements of that class.
  For example:
  {"RequiredFeatures": [RequiredFeatures(...), RequiredFeatures(...)], "RequiredObjects": [RequiredObjects(...)]}
result property
Result of the calculation. This is None until the method "calculate" is called.
Returns:
- Series | DataFrame | None – Result of the calculation if the method "calculate" was called. None otherwise.
calculate(period, save_into=None, cached_data=None, **kwargs)
Calculate soiling losses for the specified period.
The calculation process:
1. Get soiling sensor data for the entire year (from Jan 1) to better capture seasonal trends and get the whole data distribution
2. Calculate adjusted soiling ratios by detecting cleaning/soiling events and tracing an interpolated line between events
3. Store calculated SR and adjusted SR in bazefield
4. Get theoretical power and adjusted ratios for target period
5. Calculate monthly weighted soiling ratios
6. Apply monthly ratios to 10min theoretical power
Parameters:
- period (DateTimeRange) – Target period for loss calculation.
- save_into (Literal['all', 'performance_db'] | None, default: None) – Where to save results. Default is None.
- cached_data (DataFrame | None, default: None) – Pre-calculated data to improve performance. Default is None.
- **kwargs (dict, default: {}) – Additional arguments for the save method.
Returns:
- Series – Soiling losses at 10-minute frequency, indexed by timestamp.
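Steps 5 and 6 (irradiance-weighted monthly ratio, then loss on 10-minute power) can be sketched on synthetic data as follows. The function names and the column names `soiling_ratio` and `irradiance` are hypothetical; the private helper the class actually uses may differ.

```python
import pandas as pd


def monthly_weighted_ratio(daily: pd.DataFrame) -> pd.Series:
    """Σ(soiling_ratio × irradiance) / Σ(irradiance) per calendar month."""
    months = daily.index.to_period("M")
    weighted = (daily["soiling_ratio"] * daily["irradiance"]).groupby(months).sum()
    total = daily["irradiance"].groupby(months).sum()
    return weighted / total


def soiling_loss(theoretical_power: pd.Series, monthly_ratio: pd.Series) -> pd.Series:
    """(1 - monthly_ratio) × theoretical_power for each 10-minute sample."""
    # Look up the ratio of the month each timestamp belongs to
    months = theoretical_power.index.to_period("M")
    ratio = monthly_ratio.reindex(months).to_numpy()
    return (1 - ratio) * theoretical_power


# Two days in the same month: the sunnier day dominates the weighted ratio
daily = pd.DataFrame(
    {"soiling_ratio": [0.9, 1.0], "irradiance": [1000.0, 3000.0]},
    index=pd.to_datetime(["2024-01-10", "2024-01-11"]),
)
ratios = monthly_weighted_ratio(daily)  # (900 + 3000) / 4000 = 0.975

power = pd.Series(
    [200.0, 400.0],
    index=pd.to_datetime(["2024-01-10 12:00", "2024-01-10 12:10"]),
)
losses = soiling_loss(power, ratios)
```

A plain average of the two daily ratios would give 0.95; weighting by irradiance moves the monthly ratio to 0.975 because the cleaner day also had three times the irradiance.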
Source code in echo_energycalc/solar_energy_loss_soiling.py
def calculate(
    self,
    period: DateTimeRange,
    save_into: Literal["all", "performance_db"] | None = None,
    cached_data: DataFrame | None = None,
    **kwargs,
) -> Series:
    """
    Calculate soiling losses for the specified period.

    The calculation process:
    1. Get soiling sensor data for the entire year (from Jan 1) to better capture seasonal trends and get the whole data distribution
    2. Calculate adjusted soiling ratios by detecting cleaning/soiling events and tracing an interpolated line between events
    3. Store calculated SR and adjusted SR in bazefield
    4. Get theoretical power and adjusted ratios for target period
    5. Calculate monthly weighted soiling ratios
    6. Apply monthly ratios to 10min theoretical power

    Parameters
    ----------
    period : DateTimeRange
        Target period for loss calculation.
    save_into : Literal["all", "performance_db"] | None, optional
        Where to save results. Default is None.
    cached_data : DataFrame | None, optional
        Pre-calculated data to improve performance. Default is None.
    **kwargs : dict
        Additional arguments for the save method.

    Returns
    -------
    Series
        Soiling losses at 10-minute frequency, indexed by timestamp.
    """
    t0 = perf_counter()

    # ------- Defining periods -------
    # Getting the last day of the month for period end - this is done to get the whole monthly SR variability
    year = period.end.year
    month = period.end.month
    last_day = monthrange(year, month)[1]

    # Soiling Ratio Adjusted calculation period - from Jan 1 to the end of the target month
    soiling_period = period.copy()
    if "BRR" in self.object:
        soiling_period.start = soiling_period.start.replace(
            month=1,
            day=1,
            hour=0,
            minute=0,
            second=0,
            microsecond=0,
        )
    else:
        soiling_period.start = datetime(2024, 4, 1)  # start of the RBG soiling data
    soiling_period.end = soiling_period.end.replace(
        day=last_day,
        hour=23,
        minute=59,
        second=59,
        microsecond=999999,
    )

    # Calculation period - from start to the end of the target month
    calculate_period = period.copy()
    calculate_period.start = calculate_period.start.replace(
        day=1,
        hour=0,
        minute=0,
        second=0,
        microsecond=0,
    )
    calculate_period.end = calculate_period.end.replace(
        day=last_day,
        hour=23,
        minute=59,
        second=59,
        microsecond=999999,
    )

    # ------ 1: Get soiling data and calculate adjusted SR ------
    # Setup required features from the soiling station with bazefield suffix
    soiling_features = (
        ["CellIrradiance1_5min.AVG", "CellIrradiance2_5min.AVG"]
        if "BRR" in self.object
        else ["SoilModuleIccCurrent", "SoilModuleIrradiance", "CleanModuleIrradiance"]
    )
    features = {
        self._soiling_station: [f"{feat}_b#" for feat in soiling_features],
        self.object: ["IrradiancePOAReference_5min.AVG_b#"],
    }

    # Getting soiling data for the extended period
    self._add_requirement(RequiredFeatures(features=features))
    self._get_required_data(
        period=soiling_period,
        reindex=None,
        round_timestamps={"freq": timedelta(minutes=5), "tolerance": timedelta(minutes=2)},
        cached_data=cached_data,
    )
    soiling_df = self._get_requirement_data("RequiredFeatures")
    t1 = perf_counter()

    # Adjust dataframe and get SR adjusted values
    soiling_df = soiling_df.droplevel(0, axis=1)
    soiling_df.columns = soiling_df.columns.str.removesuffix("_b#")
    self.soiling_ratio_adjusted(soiling_df)
    t2 = perf_counter()

    # ------ 2: Get target period data with adjusted ratios -------
    self._requirements = None
    target_features = {
        self._weather_station: ["SoilRateAdjusted_1d.AVG_b#"],
        self.object: [
            "IrradiancePOAReference_5min.AVG_b#",
            "ActivePowerTheoretical_10min.AVG_b#",
        ],
    }
    self._add_requirement(RequiredFeatures(features=target_features))
    self._get_required_data(
        period=calculate_period,
        reindex=None,
        round_timestamps={"freq": timedelta(minutes=5), "tolerance": timedelta(minutes=2)},
        cached_data=cached_data,
    )
    target_df = self._get_requirement_data("RequiredFeatures")

    # Process target period data
    obj_df = target_df[self.object]
    obj_df.columns = obj_df.columns.str.removesuffix("_b#")

    # ------- 3: Calculate Monthly Soiling Ratio weighted by Irradiance -------
    # Resample irradiance from 5min to 10min to match theoretical power frequency
    irradiance_10min = obj_df[["IrradiancePOAReference_5min.AVG"]].resample("10min").mean()
    power_10min = obj_df[["ActivePowerTheoretical_10min.AVG"]]

    # Combine data at same frequency (10min)
    obj_10min = irradiance_10min.join(power_10min, how="inner")

    # Aggregate daily for monthly ratio calculation only
    daily_power_df = obj_10min.resample("D").sum()
    daily_soiling_df = target_df[self._weather_station]

    # Clean column names for soiling data
    daily_soiling_df.columns = daily_soiling_df.columns.str.removesuffix("_b#")

    # Combine daily data for monthly ratio calculation
    daily_combined_df = daily_power_df.join(daily_soiling_df, how="inner")

    # Calculate monthly weighted soiling ratios using daily data
    monthly_ratios = self._calculate_monthly_weighted_soiling_ratio(daily_combined_df)

    # ------- 4: Calculate Soiling Losses on 10min frequency -------
    # Apply monthly ratios to 10-minute data
    obj_10min["month"] = obj_10min.index.to_period("M")
    obj_10min["monthly_soiling_ratio"] = obj_10min["month"].map(monthly_ratios)
    obj_10min["soiling_loss_10min"] = (1 - obj_10min["monthly_soiling_ratio"]) * obj_10min["ActivePowerTheoretical_10min.AVG"]

    # Create result from 10-minute losses
    result = self._create_empty_result(period=period, freq="10min", result_type="Series")
    result = result.fillna(obj_10min["soiling_loss_10min"])
    self._result = result.copy()
    self.save(save_into=save_into, **kwargs)

    logger.debug(
        f"{self.object} - {self.feature} - {period}: "
        f"Soiling data processing {t1 - t0:.2f}s - "
        f"Ratio calculation {t2 - t1:.2f}s - "
        f"Loss calculation {perf_counter() - t2:.2f}s",
    )
    return result
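The period handling in the source above can be tried in isolation. The sketch below uses plain `datetime` objects in place of `DateTimeRange` (an assumption made so the example is self-contained), and the two helper names are hypothetical.

```python
from calendar import monthrange
from datetime import datetime


def extend_to_month_end(end: datetime) -> datetime:
    """Move a timestamp to the last microsecond of its month."""
    # monthrange returns (weekday of day 1, number of days in the month)
    last_day = monthrange(end.year, end.month)[1]
    return end.replace(day=last_day, hour=23, minute=59, second=59, microsecond=999999)


def start_of_year(start: datetime) -> datetime:
    """Move a timestamp back to January 1st, 00:00 of the same year."""
    return start.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)


# A request for 10 Feb 2024 is widened to cover 1 Jan through 29 Feb (leap year)
requested = datetime(2024, 2, 10, 8, 30)
extended_start = start_of_year(requested)
extended_end = extend_to_month_end(requested)
```

Using `monthrange` keeps the month-end correct for leap years and for months of different lengths without any hard-coded day counts.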
save(save_into=None, **kwargs)
Method to save the calculated feature values in performance_db.
Parameters:
- save_into (Literal['all', 'performance_db'] | None, default: None) – Argument that will be passed to the method "save". The options are:
  - "all": The feature will be saved in performance_db and bazefield.
  - "performance_db": The feature will be saved only in performance_db.
  - None: The feature will not be saved.
  By default None.
- **kwargs (dict, default: {}) – Not being used at the moment. Here only for compatibility.
Source code in echo_energycalc/feature_calc_core.py
def save(
    self,
    save_into: Literal["all", "performance_db"] | None = None,
    **kwargs,  # noqa: ARG002
) -> None:
    """
    Method to save the calculated feature values in performance_db.

    Parameters
    ----------
    save_into : Literal["all", "performance_db"] | None, optional
        Argument that will be passed to the method "save". The options are:
        - "all": The feature will be saved in performance_db and bazefield.
        - "performance_db": the feature will be saved only in performance_db.
        - None: The feature will not be saved.
        By default None.
    **kwargs : dict, optional
        Not being used at the moment. Here only for compatibility.
    """
    # checking arguments
    if not isinstance(save_into, str | type(None)):
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}")
    if isinstance(save_into, str) and save_into not in ["all", "performance_db"]:
        raise ValueError(f"save_into must be 'all', 'performance_db' or None, not {save_into}")

    # checking if calculation was done
    if self.result is None:
        raise ValueError(
            "The calculation was not done. Cannot save the feature calculation results. Please make sure to do something like 'self._result = df[self.feature].copy()' in the method 'calculate' before calling 'self.save()'.",
        )

    if save_into is None:
        return

    if isinstance(save_into, str):
        if save_into not in ["performance_db", "all"]:
            raise ValueError(f"save_into must be 'performance_db' or 'all', not {save_into}.")
        upload_to_bazefield = save_into == "all"
    elif save_into is None:
        upload_to_bazefield = False
    else:
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}.")

    # converting result series to DataFrame if needed
    if isinstance(self.result, Series):
        result_df = self.result.to_frame()
    elif isinstance(self.result, DataFrame):
        result_df = self.result.droplevel(0, axis=1)
    else:
        raise TypeError(f"result must be a pandas Series or DataFrame, not {type(self.result)}.")

    # adjusting DataFrame to be inserted in the database
    # making the columns a MultiIndex with levels object_name and feature_name
    result_df.columns = MultiIndex.from_product([[self.object], result_df.columns], names=["object_name", "feature_name"])

    self._perfdb.features.values.series.insert(
        df=result_df,
        on_conflict="update",
        bazefield_upload=upload_to_bazefield,
    )
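The column reshaping performed before the database insert can be reproduced with pandas alone. In this sketch the object name "OBJ-01" is hypothetical, and it is assumed that the result Series is named after the feature.

```python
import pandas as pd

# Hypothetical 10-minute result Series, named after the feature
result = pd.Series(
    [1.0, 2.0],
    index=pd.date_range("2024-01-01", periods=2, freq="10min"),
    name="LostActivePowerSoiling",
)

# A Series becomes a one-column DataFrame whose column is the Series name
result_df = result.to_frame()

# Wrap the columns in a two-level MultiIndex: (object_name, feature_name)
result_df.columns = pd.MultiIndex.from_product(
    [["OBJ-01"], result_df.columns],
    names=["object_name", "feature_name"],
)
print(result_df.columns.tolist())
```

The two-level layout lets a single DataFrame carry values for several objects and features at once, which is presumably why the insert call expects it.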
soiling_ratio_adjusted(df)
Calculate and store adjusted soiling ratio in bazefield.
This method processes the soiling data, detects events, interpolates clean values, and stores the result in bazefield for later use.
Parameters:
- df (DataFrame) – Input dataframe with soiling sensor data.
Source code in echo_energycalc/solar_energy_loss_soiling.py
def soiling_ratio_adjusted(self, df: DataFrame) -> None:
    """
    Calculate and store adjusted soiling ratio in bazefield.

    This method processes the soiling data, detects events, interpolates
    clean values, and stores the result in bazefield for later use.

    Parameters
    ----------
    df : DataFrame
        Input dataframe with soiling sensor data.
    """
    baze = Baze()
    interpolated_ratio = self._calculate_soiling_ratio_brr(df) if "BRR" in self.object else self._calculate_soiling_ratio_rbg(df)

    # Create MultiIndex columns for bazefield insertion
    columns = MultiIndex.from_tuples(
        [(self._weather_station, "SoilRateAdjusted_1d.AVG")],
        names=["object_name", "point"],
    )

    # Insert data into bazefield
    data = DataFrame(
        data=interpolated_ratio.tolist(),
        index=interpolated_ratio.index,
        columns=columns,
    )
    baze.points.values.series.insert(data=data)
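The private `_calculate_soiling_ratio_*` helpers are not shown on this page. As one possible reading of "interpolates between detected events", the sketch below keeps the ratio on anchor days and draws straight lines between them. The choice of anchors (each event day, the day before it, and both ends of the series) is an assumption, as is the function name.

```python
import pandas as pd


def interpolate_between_events(daily_ratio: pd.Series, events: pd.Series) -> pd.Series:
    """Keep the ratio on anchor days and interpolate linearly in between."""
    # Assumption: anchor the event day and the day just before it
    anchors = events | events.shift(-1, fill_value=False)
    # Always anchor both ends so the whole series is covered
    anchors.iloc[[0, -1]] = True
    kept = daily_ratio.where(anchors)  # NaN everywhere except anchor days
    return kept.interpolate(method="time")


# Six days with two noisy readings and one cleaning event on day 4
days = pd.date_range("2024-01-01", periods=6, freq="D")
ratio = pd.Series([1.00, 0.90, 0.96, 1.00, 0.70, 0.98], index=days)
events = pd.Series([False, False, False, True, False, False], index=days)
adjusted = interpolate_between_events(ratio, events)
```

The noisy readings on days 2 and 5 are replaced by points on the lines joining their neighboring anchors (0.98 and 0.99), while the step at the cleaning event is preserved.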