Solar Soiling Loss
Overview
SolarEnergyLossSoiling calculates the energy loss attributable to soiling (dust, bird droppings, etc.) on solar panels. The approach is based on detecting cleaning events from sensor data, interpolating a clean soiling ratio curve between those events, and applying monthly irradiance-weighted averages to the theoretical power.
Two sensor configurations are supported:
- BRR objects: dual irradiance sensors; soiling ratio = CellIrradiance1 / CellIrradiance2.
- RBG objects: short-circuit current measurements; soiling ratio = normalized Icc / Irradiance.
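Calculation Logic
Period Expansion
The soiling-ratio stage (sensor data retrieval and event detection) operates on a window that starts on January 1st of the requested start year and runs to the end of the month containing the period end, even if a shorter period is requested. This gives the event detection enough data to compute stable statistical thresholds. The loss itself is then computed over the whole calendar months that cover the requested period.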
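The window handling can be sketched with the standard library alone. This mirrors the period arithmetic shown in the calculate source further down; expand_periods and the dict keys are hypothetical names used only for illustration.

```python
from calendar import monthrange
from datetime import datetime


def expand_periods(start: datetime, end: datetime) -> dict[str, tuple[datetime, datetime]]:
    """Return the soiling window and the loss window for a requested period."""
    # Both windows end at the last microsecond of the month containing `end`.
    last_day = monthrange(end.year, end.month)[1]
    month_end = end.replace(day=last_day, hour=23, minute=59, second=59, microsecond=999999)
    # Soiling window: back to January 1st of the start year.
    soiling_start = start.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    # Loss window: first day of the start month.
    calc_start = start.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    return {"soiling": (soiling_start, month_end), "calculate": (calc_start, month_end)}


windows = expand_periods(datetime(2024, 6, 10, 8, 30), datetime(2024, 6, 20, 17, 0))
print(windows["soiling"])
print(windows["calculate"])
```

A request for ten days in June therefore reads soiling data from January 1st to June 30th and computes ratios over the whole of June.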
Step 1: Soiling Ratio Calculation
For BRR objects (dual irradiance sensors):
- Compute raw soiling ratio: SR = CellIrradiance1 / CellIrradiance2.
- Filter to the midday window (11:00–13:00) and irradiance above 500 W/m².
- Resample to daily mean SR.
- Store the raw SR in Bazefield as SoilRateServerCalculated.
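A minimal sketch of the BRR steps above, written in plain Python rather than the Polars used by the calculator. The function name is hypothetical, and two details are assumptions because this page does not state them: the window is treated as start-inclusive and end-exclusive, and both cells must exceed the irradiance threshold.

```python
from collections import defaultdict
from datetime import datetime, time
from statistics import fmean

MIN_IRRADIANCE = 500.0  # W/m², from the constants table
WINDOW = (time(11, 0), time(13, 0))  # assumed: start inclusive, end exclusive


def daily_soiling_ratio_brr(samples):
    """samples: iterable of (timestamp, CellIrradiance1, CellIrradiance2)."""
    per_day = defaultdict(list)
    for ts, irr1, irr2 in samples:
        if irr1 is None or irr2 is None:
            continue
        in_window = WINDOW[0] <= ts.time() < WINDOW[1]
        # Assumption: both cells must exceed the irradiance threshold.
        if in_window and irr1 > MIN_IRRADIANCE and irr2 > MIN_IRRADIANCE:
            per_day[ts.date()].append(irr1 / irr2)
    # Daily mean of the ratios that survived the filters.
    return {day: fmean(ratios) for day, ratios in sorted(per_day.items())}


samples = [
    (datetime(2024, 6, 1, 11, 0), 800.0, 1000.0),  # kept, ratio 0.80
    (datetime(2024, 6, 1, 12, 0), 900.0, 1000.0),  # kept, ratio 0.90
    (datetime(2024, 6, 1, 15, 0), 700.0, 1000.0),  # dropped: outside the window
    (datetime(2024, 6, 2, 12, 0), 400.0, 450.0),   # dropped: below 500 W/m²
]
daily_sr = daily_soiling_ratio_brr(samples)
print(daily_sr)
```

Only June 1st survives the filters, with a daily mean of about 0.85.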
For RBG objects (Icc current measurements):
- Filter to the midday window (10:00–14:00) and clean module irradiance above 500 W/m².
- Compute daily performance metric: PM = Σ(Icc) / Σ(Irradiance).
- Normalize by the 99th percentile of PM (so 1.0 = best observed performance).
- Store normalized PM in Bazefield as SoilRateServerCalculated.
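The RBG steps can be sketched the same way in plain Python. The names are hypothetical, and the sketch makes assumptions this page does not confirm: a single irradiance value per sample is used both for the threshold filter and as the PM denominator, the window is start-inclusive and end-exclusive, and the percentile uses linear interpolation.

```python
from collections import defaultdict
from datetime import datetime, time

MIN_IRRADIANCE = 500.0  # W/m², from the constants table
NORMALIZATION_PERCENTILE = 0.99


def quantile(values, q):
    """Linear-interpolation quantile (assumed method; the source may differ)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def daily_pm_rbg(samples):
    """samples: iterable of (timestamp, Icc, irradiance in W/m²)."""
    sums = defaultdict(lambda: [0.0, 0.0])  # day -> [sum of Icc, sum of irradiance]
    for ts, icc, irradiance in samples:
        in_window = time(10, 0) <= ts.time() < time(14, 0)
        if in_window and irradiance > MIN_IRRADIANCE:
            sums[ts.date()][0] += icc
            sums[ts.date()][1] += irradiance
    if not sums:
        return {}
    pm = {day: icc_sum / irr_sum for day, (icc_sum, irr_sum) in sorted(sums.items())}
    # Normalize so that the 99th percentile of PM maps to 1.0.
    reference = quantile(list(pm.values()), NORMALIZATION_PERCENTILE)
    return {day: value / reference for day, value in pm.items()}


samples = [
    (datetime(2024, 6, 1, 12, 0), 8.0, 1000.0),
    (datetime(2024, 6, 2, 12, 0), 9.0, 1000.0),
    (datetime(2024, 6, 3, 12, 0), 10.0, 1000.0),
    (datetime(2024, 6, 3, 16, 0), 1.0, 1000.0),  # dropped: outside the window
]
normalized = daily_pm_rbg(samples)
print(normalized)
```

Because the reference is the 99th percentile rather than the maximum, the best day can come out slightly above 1.0. The soiling_ratio_adjusted source shown later clips the adjusted series to an upper bound of 1.0 before storing it.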
Step 2: Event Detection and Interpolation
Applied to the daily SR/PM series:
- Compute a 7-day centered rolling median of the soiling ratio.
- Take the first difference of the rolling median.
- Detect events where the absolute change exceeds Q3 + 1.5 × IQR of all changes; these are cleaning or soiling events.
- Mark the start of long null runs (≥ 3 consecutive null days, flagged 5 days in advance) as additional events.
- Keep only the soiling ratio at event markers; linearly interpolate all other values between events.
The adjusted soiling ratio is stored in Bazefield as SoilRateAdjusted_1d.AVG.
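The detection and interpolation mechanics can be illustrated with the standard library. This is a sketch under stated assumptions, not the calculator's implementation: the rolling window is truncated at the series edges, quartiles are taken over the absolute changes, the first and last days act as interpolation anchors, and the null-run rule is left out. All function names are hypothetical.

```python
from statistics import median, quantiles


def rolling_median(values, window=7):
    """Centered rolling median; windows are truncated at the series edges (assumption)."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1]) for i in range(len(values))]


def detect_events(values, window=7, multiplier=1.5):
    """Return indices of days whose rolling-median step exceeds Q3 + multiplier * IQR."""
    smooth = rolling_median(values, window)
    # changes[i] is the absolute step from day i to day i + 1.
    changes = [abs(b - a) for a, b in zip(smooth, smooth[1:])]
    # Assumption: quartiles are taken over the absolute changes.
    q1, _, q3 = quantiles(changes, n=4)
    threshold = q3 + multiplier * (q3 - q1)
    return [i + 1 for i, change in enumerate(changes) if change > threshold]


def interpolate_between(values, markers):
    """Keep values at marker indices and fill everything else linearly."""
    # Assumption: the first and last days also act as anchors.
    anchors = sorted({0, len(values) - 1, *markers})
    out = list(values)
    for left, right in zip(anchors, anchors[1:]):
        for i in range(left + 1, right):
            frac = (i - left) / (right - left)
            out[i] = values[left] + frac * (values[right] - values[left])
    return out


# Ten days of slow soiling, a cleaning on day 10, then ten more days of soiling.
decline = [round(1.00 - 0.01 * k, 2) for k in range(10)]
events = detect_events(decline + decline)
print(events)

# Interpolation on a short noisy series with one marker at index 4.
filled = interpolate_between([1.00, 0.97, 0.99, 0.93, 0.90, 1.00], markers=[4])
print(filled)
```

In the first example the gradual 0.01 steps stay below the threshold and only the jump caused by the cleaning is flagged. Which days end up as markers in production depends on details of the detection that this page does not show, so the second example passes its marker explicitly.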
Step 3: Monthly Weighted Soiling Ratio
For each calendar month, compute an irradiance-weighted average soiling ratio:
monthly_ratio = Σ(SoilRateAdjusted × Irradiance) / Σ(Irradiance)
Irradiance weighting gives more importance to high-production days, where soiling losses are largest.
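A small worked example of the weighted average, in plain Python with made-up numbers. The function name and the input layout are hypothetical.

```python
from collections import defaultdict
from datetime import date


def monthly_weighted_ratio(daily):
    """daily: iterable of (day, adjusted soiling ratio, daily irradiance sum)."""
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for day, ratio, irradiance in daily:
        month_start = day.replace(day=1)
        numerator[month_start] += ratio * irradiance
        denominator[month_start] += irradiance
    # Months with zero total irradiance have no defined ratio and are skipped.
    return {m: numerator[m] / denominator[m] for m in sorted(denominator) if denominator[m] > 0}


daily = [
    (date(2024, 6, 1), 0.98, 6000.0),  # sunny day, light soiling
    (date(2024, 6, 2), 0.90, 2000.0),  # overcast day, heavier soiling
    (date(2024, 7, 1), 0.95, 5000.0),
]
monthly = monthly_weighted_ratio(daily)
print(monthly)
```

For June the weighted ratio is (0.98 × 6000 + 0.90 × 2000) / 8000 = 0.96, whereas an unweighted mean of the two days would give 0.94.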
Step 4: Loss Application
For each 10-minute timestamp in the target period:
soiling_loss_kW = (1 - monthly_ratio_for_this_month) × ActivePowerTheoretical_10min.AVG
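The formula applied row by row, as a plain-Python sketch with hypothetical names and made-up values. Timestamps whose month has no ratio get no loss value, which is how a left join on the month would behave.

```python
from datetime import datetime


def soiling_loss_kw(power_rows, monthly_ratios):
    """power_rows: (timestamp, theoretical power in kW); monthly_ratios: {(year, month): ratio}."""
    losses = []
    for ts, power_kw in power_rows:
        ratio = monthly_ratios.get((ts.year, ts.month))
        # A month without a ratio yields no loss value.
        loss = None if ratio is None else (1.0 - ratio) * power_kw
        losses.append((ts, loss))
    return losses


rows = [
    (datetime(2024, 6, 1, 12, 0), 1000.0),
    (datetime(2024, 6, 1, 12, 10), 500.0),
    (datetime(2024, 7, 1, 12, 0), 800.0),  # no July ratio available
]
losses = soiling_loss_kw(rows, {(2024, 6): 0.96})
print(losses)
```

With a monthly ratio of 0.96, a theoretical power of 1000 kW gives a loss of about 40 kW. Sustained over one 10-minute interval, that corresponds to roughly 40 × 10 / 60 ≈ 6.67 kWh.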
Database Requirements
Feature Attribute
| Attribute | Value |
|---|---|
| server_calc_type | solar_energy_loss_soiling |
Object Attributes
| Attribute | Required | Description |
|---|---|---|
| reference_soiling_stations | Yes | List of soiling measurement station names. The first entry is used. |
| reference_weather_stations | Yes | Dict with "complete_ws" key naming the weather station that stores the adjusted soiling rate in Bazefield. |
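As an illustration of the expected attribute shapes, the sketch below uses placeholder station names (they are not real objects) and a plain dict in place of the database lookup. The explicit check is only illustrative; in the calculator the ValueError comes from the requirement machinery.

```python
# Hypothetical attribute values for one object; the station names are placeholders.
object_attributes = {
    "reference_soiling_stations": ["EXAMPLE-SOILING-01", "EXAMPLE-SOILING-02"],
    "reference_weather_stations": {"complete_ws": "EXAMPLE-WS-01"},
}

required = ("reference_soiling_stations", "reference_weather_stations")
missing = [key for key in required if key not in object_attributes]
if missing:
    raise ValueError(f"Missing required object attributes: {missing}")

# Only the first soiling station is used, even when several are listed.
soiling_station = object_attributes["reference_soiling_stations"][0]
weather_station = object_attributes["reference_weather_stations"]["complete_ws"]
print(soiling_station, weather_station)
```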
Features (soiling station, from Bazefield)
BRR objects:
| Feature | Description |
|---|---|
| CellIrradiance1_5min.AVG | Clean reference cell irradiance (W/m²) |
| CellIrradiance2_5min.AVG | Soiled cell irradiance (W/m²) |
RBG objects:
| Feature | Description |
|---|---|
| SoilModuleIccCurrent | Short-circuit current of soiled module |
| SoilModuleIrradiance | Irradiance at soiled module |
| CleanModuleIrradiance | Irradiance at clean reference module |
Features (object, from Bazefield)
| Feature | Description |
|---|---|
| IrradiancePOAReference_5min.AVG | Plane-of-array reference irradiance (W/m²). Used as irradiance weight in monthly ratio calculation. |
| ActivePowerTheoretical_10min.AVG | Theoretical power (kW). Loss is applied to this. |
Features (complete weather station, from Bazefield)
| Feature | Description |
|---|---|
| SoilRateAdjusted_1d.AVG | Adjusted soiling ratio previously stored by the calculator itself. Read back for the target-period calculation step. |
Module-Level Constants
| Constant | Value | Description |
|---|---|---|
| SOILING_DETECTION_WINDOW | 7 days | Rolling window for median-based event detection |
| SOILING_THRESHOLD_MULTIPLIER | 1.5 | IQR multiplier for the event detection threshold |
| MIN_IRRADIANCE_THRESHOLD | 500 W/m² | Minimum irradiance for valid soiling ratio data |
| NORMALIZATION_PERCENTILE | 0.99 | Percentile used to normalize RBG performance metric |
Class Definition
SolarEnergyLossSoiling(object_name, feature)
Calculator for solar energy losses due to soiling effects.
This class calculates soiling losses based on:
1. Soiling ratio calculated from dual irradiance sensors (for BRR objects) or current measurements (for RBG objects)
2. Linear interpolation between cleaning/soiling events
3. Monthly weighted averages using irradiance as weights
4. Application of monthly ratios to 10-minute theoretical power = LostActivePowerSoiling
The calculation process:
- Detects soiling/cleaning events using a statistical threshold (Q3 + 1.5 × IQR of rolling median changes)
- Reconstructs the clean/soiling curve via linear interpolation
- Calculates monthly weighted soiling ratios
- Applies monthly ratios to 10-minute theoretical power to get the loss at each timestamp
Parameters:
- object_name (str) – Name of the solar asset object in performance_db.
- feature (str) – Name of the soiling loss feature to calculate.
Raises:
- ValueError – If required object attributes are missing from the database.
Source code in echo_energycalc/solar_energy_loss_soiling.py
def __init__(self, object_name: str, feature: str) -> None:
    """
    Initialize the soiling loss calculator.

    Parameters
    ----------
    object_name : str
        Name of the solar asset object in performance_db.
    feature : str
        Name of the soiling loss feature to calculate.

    Raises
    ------
    ValueError
        If required object attributes are missing from the database.
    """
    super().__init__(object_name, feature)
    self._add_requirement(
        RequiredObjectAttributes(
            {
                self.object: [
                    "reference_soiling_stations",
                    "reference_weather_stations",
                ],
            },
        ),
    )
    self._fetch_requirements()
    self._soiling_station = self._requirement_data("RequiredObjectAttributes")[self.object]["reference_soiling_stations"][0]
    self._weather_station = self._requirement_data("RequiredObjectAttributes")[self.object]["reference_weather_stations"]["complete_ws"]
feature (property)
Feature that is calculated. This will be defined in the constructor and cannot be changed.
Returns:
- str – Name of the feature that is calculated.
name (property)
Name of the feature calculator. Is defined in child classes of FeatureCalculator.
This must be equal to the "server_calc_type" attribute of the feature in performance_db.
Returns:
- str – Name of the feature calculator.
object (property)
Object for which the feature is calculated. This will be defined in the constructor and cannot be changed.
Returns:
- str – Object name for which the feature is calculated.
requirements (property)
List of requirements of the feature calculator. Is defined in child classes of FeatureCalculator.
Returns:
- dict[str, list[CalculationRequirement]] – Dict of requirements. The keys are the names of the classes of the requirements and the values are lists of requirements of that class. For example:
{"RequiredFeatures": [RequiredFeatures(...), RequiredFeatures(...)], "RequiredObjects": [RequiredObjects(...)]}
result (property)
Result of the calculation. This is None until the method "calculate" is called.
Returns:
- DataFrame | None – Polars DataFrame with a "timestamp" column and one or more feature value columns. None until calculate is called.
calculate(period, save_into=None, cached_data=None, **kwargs)
Calculate soiling losses for the specified period at 10-minute resolution.
The calculation process:
1. Get soiling sensor data for the entire year (from Jan 1) to better capture seasonal trends and get the whole data distribution.
2. Calculate adjusted soiling ratios by detecting cleaning/soiling events and tracing an interpolated line between events.
3. Store the calculated SR and adjusted SR in Bazefield.
4. Get theoretical power and adjusted ratios for the target period.
5. Calculate monthly weighted soiling ratios.
6. Apply monthly ratios to 10-minute theoretical power.
Parameters:
- period (DateTimeRange) – Target period for loss calculation.
- save_into (Literal['all', 'performance_db'] | None, default: None) – Where to save results.
- cached_data (DataFrame | None, default: None) – Pre-calculated data to improve performance.
- **kwargs (dict, default: {}) – Additional arguments for the save method.
Returns:
- DataFrame – Polars DataFrame with the calculated soiling losses.
Source code in echo_energycalc/solar_energy_loss_soiling.py
def calculate(
    self,
    period: DateTimeRange,
    save_into: Literal["all", "performance_db"] | None = None,
    cached_data: pl.DataFrame | None = None,
    **kwargs,
) -> pl.DataFrame:
    """
    Calculate soiling losses for the specified period at 10-minute resolution.

    The calculation process:
    1. Get soiling sensor data for entire year (from Jan 1) - to better capture seasonal trends and get the whole data distribution
    2. Calculate adjusted soiling ratios by detecting clean/soiling events and trace an interpolated line between events
    3. Store calculated SR and adjusted SR in bazefield
    4. Get theoretical power and adjusted ratios for target period
    5. Calculate monthly weighted soiling ratios
    6. Apply monthly ratios to 10min theoretical power

    Parameters
    ----------
    period : DateTimeRange
        Target period for loss calculation.
    save_into : Literal["all", "performance_db"] | None, optional
        Where to save results. Default is None.
    cached_data : pl.DataFrame | None, optional
        Pre-calculated data to improve performance. Default is None.
    **kwargs : dict
        Additional arguments for the save method.

    Returns
    -------
    pl.DataFrame
        Polars DataFrame with the calculated soiling losses.
    """
    t0 = perf_counter()

    # ------- Defining periods -------
    year = period.end.year
    month = period.end.month
    last_day = monthrange(year, month)[1]
    soiling_period = period.copy()
    soiling_period.start = soiling_period.start.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    soiling_period.end = soiling_period.end.replace(day=last_day, hour=23, minute=59, second=59, microsecond=999999)
    calculate_period = period.copy()
    calculate_period.start = calculate_period.start.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    calculate_period.end = calculate_period.end.replace(day=last_day, hour=23, minute=59, second=59, microsecond=999999)

    # ------ 1: Get soiling data and calculate adjusted SR ------
    soiling_features = (
        ["CellIrradiance1_5min.AVG", "CellIrradiance2_5min.AVG"]
        if "BRR" in self.object
        else ["SoilModuleIccCurrent", "SoilModuleIrradiance", "CleanModuleIrradiance"]
    )
    features = {
        self._soiling_station: [f"{feat}_b#" for feat in soiling_features],
        self.object: ["IrradiancePOAReference_5min.AVG_b#"],
    }
    self._add_requirement(RequiredFeatures(features=features))
    self._fetch_requirements(
        period=soiling_period,
        reindex=None,
        round_timestamps={"freq": timedelta(minutes=5), "tolerance": timedelta(minutes=2)},
        cached_data=cached_data,
    )
    soiling_pl = self._requirement_data("RequiredFeatures")
    # Strip "Obj@" prefix and "_b#" suffix from all feature columns
    soiling_pl = soiling_pl.rename({c: c.split("@", 1)[1].removesuffix("_b#") for c in soiling_pl.columns if c != "timestamp"})
    t1 = perf_counter()
    # Adjust dataframe and get SR adjusted values (writes to Bazefield as a side effect)
    self.soiling_ratio_adjusted(soiling_pl)
    t2 = perf_counter()

    # ------ 2: Get target period data with adjusted ratios -------
    self._requirements = None
    target_features = {
        self._weather_station: ["SoilRateAdjusted_1d.AVG_b#"],
        self.object: [
            "IrradiancePOAReference_5min.AVG_b#",
            "ActivePowerTheoretical_10min.AVG_b#",
        ],
    }
    self._add_requirement(RequiredFeatures(features=target_features))
    self._fetch_requirements(
        period=calculate_period,
        reindex=None,
        round_timestamps={"freq": timedelta(minutes=5), "tolerance": timedelta(minutes=2)},
        cached_data=cached_data,
    )
    target_pl = self._requirement_data("RequiredFeatures")
    # Strip "Obj@" prefix and "_b#" suffix
    target_pl = target_pl.rename({c: c.split("@", 1)[1].removesuffix("_b#") for c in target_pl.columns if c != "timestamp"})

    # ------- 3: Calculate Monthly Soiling Ratio weighted by Irradiance -------
    # Resample irradiance from 5min to 10min to match theoretical power frequency
    irradiance_10min = (
        target_pl.select(["timestamp", "IrradiancePOAReference_5min.AVG"])
        .sort("timestamp")
        .group_by_dynamic("timestamp", every="10m")
        .agg(pl.col("IrradiancePOAReference_5min.AVG").mean())
    )
    power_10min = target_pl.select(["timestamp", "ActivePowerTheoretical_10min.AVG"]).drop_nulls("ActivePowerTheoretical_10min.AVG")
    obj_10min = irradiance_10min.join(power_10min, on="timestamp", how="inner")
    # Aggregate to daily for monthly ratio calculation
    daily_power = (
        obj_10min.sort("timestamp")
        .group_by_dynamic("timestamp", every="1d")
        .agg(
            pl.col("IrradiancePOAReference_5min.AVG").sum(),
            pl.col("ActivePowerTheoretical_10min.AVG").sum(),
        )
    )
    daily_soiling = (
        target_pl.select(["timestamp", "SoilRateAdjusted_1d.AVG"])
        .drop_nulls("SoilRateAdjusted_1d.AVG")
        .sort("timestamp")
        .group_by_dynamic("timestamp", every="1d")
        .agg(pl.col("SoilRateAdjusted_1d.AVG").first())
    )
    daily_combined = daily_power.join(daily_soiling, on="timestamp", how="inner")
    monthly_ratios = self._calculate_monthly_weighted_soiling_ratio(daily_combined)

    # ------- 4: Calculate Soiling Losses on 10min frequency -------
    soiling_loss_pl = (
        obj_10min.with_columns(pl.col("timestamp").dt.truncate("1mo").alias("month_start"))
        .join(monthly_ratios, on="month_start", how="left")
        .with_columns(
            ((1 - pl.col("monthly_ratio")) * pl.col("ActivePowerTheoretical_10min.AVG")).alias("soiling_loss_10min"),
        )
        .select(["timestamp", "soiling_loss_10min"])
    )
    result_pl = (
        self._create_empty_result(period=period, freq="10min", result_type="Series")
        .join(soiling_loss_pl, on="timestamp", how="left")
        .with_columns(pl.col("soiling_loss_10min").alias(self.feature))
        .select(["timestamp", self.feature])
    )
    self._result = result_pl
    self.save(save_into=save_into, **kwargs)
    logger.debug(
        f"{self.object} - {self.feature} - {period}: "
        f"Soiling data processing {t1 - t0:.2f}s - "
        f"Ratio calculation {t2 - t1:.2f}s - "
        f"Loss calculation {perf_counter() - t2:.2f}s",
    )
    return result_pl
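The 5-minute to 10-minute alignment in section 3 of the listing can be pictured with a plain-Python sketch. It assumes each 10-minute bucket is labelled by its start time and that only timestamps present on both sides survive the inner join. The function names and sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean


def to_10min_mean(samples):
    """Average (timestamp, value) samples into buckets labelled by their 10-minute start."""
    buckets = defaultdict(list)
    for ts, value in samples:
        bucket = ts.replace(minute=ts.minute - ts.minute % 10, second=0, microsecond=0)
        buckets[bucket].append(value)
    return {bucket: fmean(values) for bucket, values in sorted(buckets.items())}


def inner_join(left, right):
    """Keep only timestamps present in both mappings."""
    return {ts: (left[ts], right[ts]) for ts in left if ts in right}


irradiance_5min = [
    (datetime(2024, 6, 1, 12, 0), 800.0),
    (datetime(2024, 6, 1, 12, 5), 900.0),
    (datetime(2024, 6, 1, 12, 10), 1000.0),  # no matching power row below
]
power_10min = {datetime(2024, 6, 1, 12, 0): 950.0}
joined = inner_join(to_10min_mean(irradiance_5min), power_10min)
print(joined)
```

The 12:00 and 12:05 irradiance samples collapse into one 12:00 row with a mean of 850 W/m², and the 12:10 bucket is dropped because no theoretical power exists for it.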
save(save_into=None, **kwargs)
Method to save the calculated feature values in performance_db.
Parameters:
- save_into (Literal['all', 'performance_db'] | None, default: None) – Argument that will be passed to the method "save". The options are:
  - "all": The feature will be saved in performance_db and bazefield.
  - "performance_db": The feature will be saved only in performance_db.
  - None: The feature will not be saved.
  By default None.
- **kwargs (dict, default: {}) – Not being used at the moment. Here only for compatibility.
Source code in echo_energycalc/feature_calc_core.py
def save(
    self,
    save_into: Literal["all", "performance_db"] | None = None,
    **kwargs,  # noqa: ARG002
) -> None:
    """
    Method to save the calculated feature values in performance_db.

    Parameters
    ----------
    save_into : Literal["all", "performance_db"] | None, optional
        Argument that will be passed to the method "save". The options are:
        - "all": The feature will be saved in performance_db and bazefield.
        - "performance_db": the feature will be saved only in performance_db.
        - None: The feature will not be saved.
        By default None.
    **kwargs : dict, optional
        Not being used at the moment. Here only for compatibility.
    """
    # checking arguments
    if not isinstance(save_into, str | type(None)):
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}")
    if isinstance(save_into, str) and save_into not in ["all", "performance_db"]:
        raise ValueError(f"save_into must be 'all', 'performance_db' or None, not {save_into}")
    # checking if calculation was done
    if self.result is None:
        raise ValueError(
            "The calculation was not done. Please call 'calculate' before calling 'save'.",
        )
    if save_into is None:
        return
    upload_to_bazefield = save_into == "all"
    if not isinstance(self.result, pl.DataFrame):
        raise TypeError(f"result must be a polars DataFrame, not {type(self.result)}.")
    if "timestamp" not in self.result.columns:
        raise ValueError("result DataFrame must contain a 'timestamp' column.")
    # rename feature columns to "object@feature" format expected by perfdb polars insert
    feat_cols = [c for c in self.result.columns if c != "timestamp"]
    result_pl = self.result.rename({col: f"{self.object}@{col}" for col in feat_cols})
    self._perfdb.features.values.series.insert(
        df=result_pl,
        on_conflict="update",
        bazefield_upload=upload_to_bazefield,
    )
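The column naming used across calculate and save forms a round trip: fetched columns arrive as "object@feature_b#", are stripped to the bare feature name for the calculation, and are prefixed with the object again before insertion. A minimal sketch with hypothetical helper names and a placeholder object name:

```python
def strip_fetched_name(column: str) -> str:
    """'Object@Feature_b#' -> 'Feature', as done after fetching in calculate."""
    return column.split("@", 1)[1].removesuffix("_b#")


def to_insert_name(object_name: str, column: str) -> str:
    """'Feature' -> 'Object@Feature', as done before inserting in save."""
    return f"{object_name}@{column}"


# "EXAMPLE-OBJ" is a placeholder, not a real object.
fetched = "EXAMPLE-OBJ@ActivePowerTheoretical_10min.AVG_b#"
bare_feature = strip_fetched_name(fetched)
insert_name = to_insert_name("EXAMPLE-OBJ", "LostActivePowerSoiling")
print(bare_feature, insert_name)
```

str.removesuffix requires Python 3.9 or later.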
soiling_ratio_adjusted(df)
Calculate and store adjusted soiling ratio in Bazefield.
This method processes the soiling data, detects events, interpolates clean values, and stores the result in Bazefield for later use.
Parameters:
- df (DataFrame) – Input DataFrame with soiling sensor data and a "timestamp" column.
Source code in echo_energycalc/solar_energy_loss_soiling.py
def soiling_ratio_adjusted(self, df: pl.DataFrame) -> None:
    """
    Calculate and store adjusted soiling ratio in Bazefield.

    This method processes the soiling data, detects events, interpolates
    clean values, and stores the result in Bazefield for later use.

    Parameters
    ----------
    df : pl.DataFrame
        Input DataFrame with soiling sensor data and a ``"timestamp"`` column.
    """
    baze = Baze()
    interpolated = self._calculate_soiling_ratio_brr(df) if "BRR" in self.object else self._calculate_soiling_ratio_rbg(df)
    ratio_col = interpolated.columns[1]  # "SoilRatio" or "PM"
    self._insert_to_baze(
        baze,
        interpolated.with_columns(pl.col(ratio_col).clip(upper_bound=1.0).alias("value")).select(["timestamp", "value"]),
        "SoilRateAdjusted_1d.AVG",
    )