Solar Soiling Loss¶

Overview¶

SolarEnergyLossSoiling calculates the energy loss attributable to soiling (dust, bird droppings, etc.) on solar panels. The approach is based on detecting cleaning events from sensor data, interpolating a clean soiling ratio curve between those events, and applying monthly irradiance-weighted averages to the theoretical power.

Two sensor configurations are supported:

BRR objects: Dual irradiance sensors — soiling ratio = CellIrradiance1 / CellIrradiance2.
RBG objects: Short-circuit current measurements — soiling ratio = normalized Icc / Irradiance.

Calculation Logic¶

Period Expansion¶

The calculation always runs on an expanded window, even if a shorter period is requested: the window starts on January 1st of the requested period's start year and ends on the last day of the month that contains the period's end. The extra history gives event detection enough data to compute stable statistical thresholds.
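The window expansion can be sketched with plain `datetime` objects. This is a minimal illustration modelled on the `calculate` source shown further down this page; `expand_periods` is a hypothetical helper name, and the real code works on a `DateTimeRange` rather than on two separate datetimes.

```python
from calendar import monthrange
from datetime import datetime


def expand_periods(start: datetime, end: datetime) -> dict:
    """Return the detection window and the calculation window (hypothetical helper)."""
    # Last day of the month that contains the requested end.
    last_day = monthrange(end.year, end.month)[1]
    month_end = end.replace(day=last_day, hour=23, minute=59, second=59, microsecond=999999)

    # Detection window: January 1 of the start year through the end of the final month.
    soiling_start = start.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    # Calculation window: whole calendar months covering the request.
    calc_start = start.replace(day=1, hour=0, minute=0, second=0, microsecond=0)

    return {"soiling": (soiling_start, month_end), "calculate": (calc_start, month_end)}


periods = expand_periods(datetime(2024, 6, 10), datetime(2024, 6, 20))
```

For a request covering June 10 to June 20, the detection window in this sketch spans January 1 to June 30, while the loss itself is computed over the whole month of June.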

Step 1 — Soiling Ratio Calculation¶

For BRR objects (dual irradiance sensors):

Compute raw soiling ratio: SR = CellIrradiance1 / CellIrradiance2.
Filter to the midday window (11:00–13:00) and irradiance above 500 W/m².
Resample to daily mean SR.
Store the raw SR in Bazefield as SoilRateServerCalculated.
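The BRR steps above can be sketched in plain Python. This is an illustration only, not the calculator's Polars implementation: `daily_soiling_ratio_brr` is a hypothetical name, the midday window is assumed to be half-open (11:00 inclusive, 13:00 exclusive), and the irradiance threshold is assumed to apply to both sensors, since the text does not say which reading is tested.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean

MIN_IRRADIANCE_THRESHOLD = 500.0  # W/m², value taken from the constants table


def daily_soiling_ratio_brr(samples):
    """samples: iterable of (timestamp, cell_irradiance_1, cell_irradiance_2)."""
    per_day = defaultdict(list)
    for ts, irr1, irr2 in samples:
        in_midday = 11 <= ts.hour < 13  # assumed half-open 11:00-13:00 window
        # Assumption: both sensors must exceed the threshold.
        bright = irr1 > MIN_IRRADIANCE_THRESHOLD and irr2 > MIN_IRRADIANCE_THRESHOLD
        if in_midday and bright:
            per_day[ts.date()].append(irr1 / irr2)  # SR as defined in the text
    # Daily mean of the surviving samples.
    return {day: fmean(values) for day, values in sorted(per_day.items())}


samples = [
    (datetime(2024, 6, 1, 11, 0), 900.0, 1000.0),  # kept
    (datetime(2024, 6, 1, 12, 0), 940.0, 1000.0),  # kept
    (datetime(2024, 6, 1, 15, 0), 800.0, 1000.0),  # outside the midday window
    (datetime(2024, 6, 2, 12, 0), 300.0, 320.0),   # below the irradiance threshold
]
daily_sr = daily_soiling_ratio_brr(samples)
```

Days with no valid midday sample simply do not appear in the result, which is how gaps (null days) arise in the daily series used by Step 2.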

For RBG objects (Icc current measurements):

Filter to midday window (10:00–14:00) and clean module irradiance above 500 W/m².
Compute daily performance metric: PM = Σ(Icc) / Σ(Irradiance).
Normalize by the 99th percentile of PM (so 1.0 = best observed performance).
Store normalized PM in Bazefield as SoilRateServerCalculated.
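A rough sketch of the RBG normalization follows. It starts from daily sums that have already passed the midday and irradiance filter, so it does not commit to which irradiance feature forms the denominator. The names `percentile` and `normalized_pm` are hypothetical, and the linear-interpolation percentile is an assumption; the calculator may use a different quantile method.

```python
NORMALIZATION_PERCENTILE = 0.99  # value taken from the constants table


def percentile(values, q):
    """Linear-interpolation percentile of a non-empty list, q in [0, 1] (assumed method)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def normalized_pm(daily_sums):
    """daily_sums: {day: (sum_icc, sum_irradiance)} after filtering."""
    # PM = sum(Icc) / sum(Irradiance) for each day.
    pm = {day: icc / irr for day, (icc, irr) in daily_sums.items() if irr > 0}
    # Divide by a high percentile so that 1.0 corresponds to the best observed days.
    reference = percentile(list(pm.values()), NORMALIZATION_PERCENTILE)
    return {day: value / reference for day, value in pm.items()}


daily_sums = {"d1": (8.0, 1000.0), "d2": (9.0, 1000.0), "d3": (10.0, 1000.0)}
norm = normalized_pm(daily_sums)
```

Because the reference is a percentile rather than the maximum, the very best days can end up slightly above 1.0, which is consistent with the upper clip applied before the adjusted ratio is stored.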

Step 2 — Event Detection and Interpolation¶

Applied to the daily SR/PM series:

Compute a 7-day centered rolling median of the soiling ratio.
Take the first difference of the rolling median.
Detect events where the absolute change exceeds Q3 + 1.5 × IQR of all changes — these are cleaning or soiling events.
Mark the start of each long null run (at least 3 consecutive null days), flagged 5 days in advance, as an additional event.
Keep only the soiling ratio at event markers; linearly interpolate all other values between events.

The adjusted soiling ratio is capped at 1.0 and stored in Bazefield as SoilRateAdjusted_1d.AVG.
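The detection and interpolation steps can be illustrated with the standard library. This is a sketch under stated assumptions, not the calculator's code: the function names are hypothetical, rolling windows are truncated at the series edges, the quartiles are taken over absolute day-to-day changes, and the null-run rule is left out. How markers are placed around a jump is not specified in the text, so interpolation is shown separately with hand-picked markers.

```python
from statistics import median, quantiles

SOILING_DETECTION_WINDOW = 7        # days, from the constants table
SOILING_THRESHOLD_MULTIPLIER = 1.5  # IQR multiplier, from the constants table


def rolling_median(values, window=SOILING_DETECTION_WINDOW):
    """Centered rolling median; windows are truncated at the edges (assumption)."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1]) for i in range(len(values))]


def detect_events(values):
    """Return indices of days whose smoothed change exceeds Q3 + 1.5 * IQR."""
    smooth = rolling_median(values)
    # Assumption: quartiles are computed over the absolute changes.
    changes = [abs(b - a) for a, b in zip(smooth, smooth[1:])]
    q1, _, q3 = quantiles(changes, n=4)
    threshold = q3 + SOILING_THRESHOLD_MULTIPLIER * (q3 - q1)
    # changes[i] is the step from day i to day i + 1, so flag day i + 1.
    return [i + 1 for i, change in enumerate(changes) if change > threshold]


def interpolate_between(values, keep):
    """Keep values at the `keep` indices and linearly interpolate the rest."""
    anchors = sorted(keep)
    out = list(values)
    for left, right in zip(anchors, anchors[1:]):
        for i in range(left + 1, right):
            frac = (i - left) / (right - left)
            out[i] = values[left] + frac * (values[right] - values[left])
    return out


# Synthetic series: slow soiling, a cleaning on day 20, then slow soiling again.
series = [1.0 - 0.005 * (i % 20) for i in range(40)]
events = detect_events(series)

# Interpolation shown on its own, with hand-picked markers.
curve = interpolate_between([1.0, 0.7, 0.95, 0.9, 1.0], keep={0, 3, 4})
```

On the synthetic series, only the jump on day 20 exceeds the threshold; the steady daily drift stays below it because it dominates the distribution of changes and therefore sets the quartiles.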

Step 3 — Monthly Weighted Soiling Ratio¶

For each calendar month, compute an irradiance-weighted average soiling ratio:

Text Only

monthly_ratio = Σ(SoilRateAdjusted × Irradiance) / Σ(Irradiance)

Irradiance weighting gives more importance to high-production days, where soiling losses are largest.
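As a small worked example of the formula, the sketch below groups daily values by calendar month and applies the irradiance weighting. `monthly_weighted_ratio` is a hypothetical stand-in for the calculator's private helper, whose actual signature is not shown on this page, and the numbers are made up for illustration.

```python
from collections import defaultdict
from datetime import date


def monthly_weighted_ratio(daily):
    """daily: iterable of (day, adjusted_soiling_ratio, irradiance_sum)."""
    num = defaultdict(float)
    den = defaultdict(float)
    for day, ratio, irradiance in daily:
        month = day.replace(day=1)  # key each day by the first day of its month
        num[month] += ratio * irradiance
        den[month] += irradiance
    # Months with no irradiance are skipped to avoid dividing by zero.
    return {month: num[month] / den[month] for month in num if den[month] > 0}


daily = [
    (date(2024, 6, 1), 0.98, 6000.0),  # sunny day, lightly soiled
    (date(2024, 6, 2), 0.90, 2000.0),  # cloudy day, heavily soiled
    (date(2024, 7, 1), 0.95, 5000.0),
]
ratios = monthly_weighted_ratio(daily)
```

For June, the weighted ratio is (0.98 × 6000 + 0.90 × 2000) / 8000 = 0.96, closer to the sunny day's value than the unweighted mean of 0.94.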

Step 4 — Loss Application¶

For each 10-minute timestamp in the target period:

Text Only

soiling_loss_kW = (1 - monthly_ratio_for_this_month) × ActivePowerTheoretical_10min.AVG
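A minimal sketch of this step, assuming the monthly ratios are available as a dictionary keyed by `(year, month)`; both that layout and the function name are illustrative choices. Timestamps without a monthly ratio yield `None`, mirroring the nulls a left join would produce.

```python
from datetime import datetime


def soiling_loss_kw(timestamp, theoretical_kw, monthly_ratios):
    """Loss for one 10-minute timestamp, or None when an input is missing."""
    ratio = monthly_ratios.get((timestamp.year, timestamp.month))
    if ratio is None or theoretical_kw is None:
        return None
    return (1.0 - ratio) * theoretical_kw


monthly_ratios = {(2024, 6): 0.96}
loss = soiling_loss_kw(datetime(2024, 6, 15, 12, 10), 1000.0, monthly_ratios)
missing = soiling_loss_kw(datetime(2024, 7, 1, 12, 0), 1000.0, monthly_ratios)
```

With a monthly ratio of 0.96 and 1000 kW of theoretical power, the loss at that timestamp is about 40 kW.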

Database Requirements¶

Feature Attribute¶

Attribute	Value
`server_calc_type`	`solar_energy_loss_soiling`

Object Attributes¶

Attribute	Required	Description
`reference_soiling_stations`	Yes	List of soiling measurement station names. The first entry is used.
`reference_weather_stations`	Yes	Dict with `"complete_ws"` key naming the weather station that stores the adjusted soiling rate in Bazefield.

Features (soiling station — from Bazefield)¶

BRR objects:

Feature	Description
`CellIrradiance1_5min.AVG`	Clean reference cell irradiance (W/m²)
`CellIrradiance2_5min.AVG`	Soiled cell irradiance (W/m²)

RBG objects:

Feature	Description
`SoilModuleIccCurrent`	Short-circuit current of soiled module
`SoilModuleIrradiance`	Irradiance at soiled module
`CleanModuleIrradiance`	Irradiance at clean reference module

Features (object — from Bazefield)¶

Feature	Description
`IrradiancePOAReference_5min.AVG`	Plane-of-array reference irradiance (W/m²). Used as irradiance weight in monthly ratio calculation.
`ActivePowerTheoretical_10min.AVG`	Theoretical power (kW). Loss is applied to this.

Features (complete weather station — from Bazefield)¶

Feature	Description
`SoilRateAdjusted_1d.AVG`	Adjusted soiling ratio previously stored by the calculator itself. Read back for the target-period calculation step.

Module-Level Constants¶

Constant	Value	Description
`SOILING_DETECTION_WINDOW`	7 days	Rolling window for median-based event detection
`SOILING_THRESHOLD_MULTIPLIER`	1.5	IQR multiplier for the event detection threshold
`MIN_IRRADIANCE_THRESHOLD`	500 W/m²	Minimum irradiance for valid soiling ratio data
`NORMALIZATION_PERCENTILE`	0.99	Percentile used to normalize RBG performance metric

Class Definition¶

`SolarEnergyLossSoiling(object_name, feature)` ¶

Calculator for solar energy losses due to soiling effects.

This class calculates soiling losses based on:

1. Soiling ratio calculated from dual irradiance sensors (for BRR objects) or current measurements (for RBG objects)
2. Linear interpolation between cleaning/soiling events
3. Monthly weighted averages using irradiance as weights
4. Application of monthly ratios to 10-minute theoretical power = LostActivePowerSoiling

The calculation process:

- Detects soiling/cleaning events using statistical thresholds (Q3 + 1.5 * IQR of rolling median changes)
- Reconstructs the clean/soiling curve via linear interpolation
- Calculates monthly weighted soiling ratios
- Applies monthly ratios to 10-minute theoretical power to get the loss at each timestamp

Parameters:

object_name ¶
(str) –

Name of the solar asset object in performance_db.
feature ¶
(str) –

Name of the soiling loss feature to calculate.

Raises:

ValueError –

If required object attributes are missing from the database.

Source code in echo_energycalc/solar_energy_loss_soiling.py

Python

def __init__(self, object_name: str, feature: str) -> None:
    """
    Initialize the soiling loss calculator.

    Parameters
    ----------
    object_name : str
        Name of the solar asset object in performance_db.
    feature : str
        Name of the soiling loss feature to calculate.

    Raises
    ------
    ValueError
        If required object attributes are missing from the database.
    """
    super().__init__(object_name, feature)

    self._add_requirement(
        RequiredObjectAttributes(
            {
                self.object: [
                    "reference_soiling_stations",
                    "reference_weather_stations",
                ],
            },
        ),
    )
    self._fetch_requirements()

    self._soiling_station = self._requirement_data("RequiredObjectAttributes")[self.object]["reference_soiling_stations"][0]
    self._weather_station = self._requirement_data("RequiredObjectAttributes")[self.object]["reference_weather_stations"]["complete_ws"]

`feature` `property` ¶

Feature that is calculated. This will be defined in the constructor and cannot be changed.

Returns:

str –

Name of the feature that is calculated.

`name` `property` ¶

Name of the feature calculator. Is defined in child classes of FeatureCalculator.

This must be equal to the "server_calc_type" attribute of the feature in performance_db.

Returns:

str –

Name of the feature calculator.

`object` `property` ¶

Object for which the feature is calculated. This will be defined in the constructor and cannot be changed.

Returns:

str –

Object name for which the feature is calculated.

`requirements` `property` ¶

List of requirements of the feature calculator. Is defined in child classes of FeatureCalculator.

Returns:

dict[str, list[CalculationRequirement]] –

Dict of requirements.

The keys are the names of the classes of the requirements and the values are lists of requirements of that class.

For example: {"RequiredFeatures": [RequiredFeatures(...), RequiredFeatures(...)], "RequiredObjects": [RequiredObjects(...)]}

`result` `property` ¶

Result of the calculation. This is None until the method "calculate" is called.

Returns:

DataFrame | None –

Polars DataFrame with a "timestamp" column and one or more feature value columns. None until calculate is called.

`calculate(period, save_into=None, cached_data=None, **kwargs)` ¶

Calculate soiling losses for the specified period at 10-minute resolution.

The calculation process:

1. Get soiling sensor data for the entire year (from Jan 1), to better capture seasonal trends and get the whole data distribution
2. Calculate adjusted soiling ratios by detecting clean/soiling events and tracing an interpolated line between events
3. Store calculated SR and adjusted SR in Bazefield
4. Get theoretical power and adjusted ratios for the target period
5. Calculate monthly weighted soiling ratios
6. Apply monthly ratios to 10min theoretical power

Parameters:

period ¶
(DateTimeRange) –

Target period for loss calculation.
save_into ¶
(Literal['all', 'performance_db'] | None, default: None ) –

Where to save results. Default is None.
cached_data ¶
(DataFrame | None, default: None ) –

Pre-calculated data to improve performance. Default is None.
**kwargs ¶
(dict, default: {} ) –

Additional arguments for the save method.

Returns:

DataFrame –

Polars DataFrame with the calculated soiling losses.

Source code in echo_energycalc/solar_energy_loss_soiling.py

Python

def calculate(
    self,
    period: DateTimeRange,
    save_into: Literal["all", "performance_db"] | None = None,
    cached_data: pl.DataFrame | None = None,
    **kwargs,
) -> pl.DataFrame:
    """
    Calculate daily soiling losses for the specified period.

    The calculation process:
    1. Get soiling sensor data for entire year (from Jan 1) - to better capture seasonal trends and get the whole data distribution
    2. Calculate adjusted soiling ratios by detecting clean/soiling events and tracing an interpolated line between events
    3. Store calculated SR and adjusted SR in bazefield
    4. Get theoretical power and adjusted ratios for target period
    5. Calculate monthly weighted soiling ratios
    6. Apply monthly ratios to 10min theoretical power

    Parameters
    ----------
    period : DateTimeRange
        Target period for loss calculation.
    save_into : Literal["all", "performance_db"] | None, optional
        Where to save results. Default is None.
    cached_data : pl.DataFrame | None, optional
        Pre-calculated data to improve performance. Default is None.
    **kwargs : dict
        Additional arguments for the save method.

    Returns
    -------
    pl.DataFrame
        Polars DataFrame with the calculated soiling losses.
    """
    t0 = perf_counter()

    # ------- Defining periods -------
    year = period.end.year
    month = period.end.month
    last_day = monthrange(year, month)[1]

    soiling_period = period.copy()
    soiling_period.start = soiling_period.start.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    soiling_period.end = soiling_period.end.replace(day=last_day, hour=23, minute=59, second=59, microsecond=999999)

    calculate_period = period.copy()
    calculate_period.start = calculate_period.start.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    calculate_period.end = calculate_period.end.replace(day=last_day, hour=23, minute=59, second=59, microsecond=999999)

    # ------ 1: Get soiling data and calculate adjusted SR ------
    soiling_features = (
        ["CellIrradiance1_5min.AVG", "CellIrradiance2_5min.AVG"]
        if "BRR" in self.object
        else ["SoilModuleIccCurrent", "SoilModuleIrradiance", "CleanModuleIrradiance"]
    )
    features = {
        self._soiling_station: [f"{feat}_b#" for feat in soiling_features],
        self.object: ["IrradiancePOAReference_5min.AVG_b#"],
    }
    self._add_requirement(RequiredFeatures(features=features))
    self._fetch_requirements(
        period=soiling_period,
        reindex=None,
        round_timestamps={"freq": timedelta(minutes=5), "tolerance": timedelta(minutes=2)},
        cached_data=cached_data,
    )
    soiling_pl = self._requirement_data("RequiredFeatures")
    # Strip "Obj@" prefix and "_b#" suffix from all feature columns
    soiling_pl = soiling_pl.rename({c: c.split("@", 1)[1].removesuffix("_b#") for c in soiling_pl.columns if c != "timestamp"})

    t1 = perf_counter()

    # Adjust dataframe and get SR adjusted values (writes to Bazefield as a side effect)
    self.soiling_ratio_adjusted(soiling_pl)

    t2 = perf_counter()

    # ------ 2: Get target period data with adjusted ratios -------
    self._requirements = None
    target_features = {
        self._weather_station: ["SoilRateAdjusted_1d.AVG_b#"],
        self.object: [
            "IrradiancePOAReference_5min.AVG_b#",
            "ActivePowerTheoretical_10min.AVG_b#",
        ],
    }
    self._add_requirement(RequiredFeatures(features=target_features))
    self._fetch_requirements(
        period=calculate_period,
        reindex=None,
        round_timestamps={"freq": timedelta(minutes=5), "tolerance": timedelta(minutes=2)},
        cached_data=cached_data,
    )
    target_pl = self._requirement_data("RequiredFeatures")
    # Strip "Obj@" prefix and "_b#" suffix
    target_pl = target_pl.rename({c: c.split("@", 1)[1].removesuffix("_b#") for c in target_pl.columns if c != "timestamp"})

    # ------- 3: Calculate Monthly Soiling Ratio weighted by Irradiance -------

    # Resample irradiance from 5min to 10min to match theoretical power frequency
    irradiance_10min = (
        target_pl.select(["timestamp", "IrradiancePOAReference_5min.AVG"])
        .sort("timestamp")
        .group_by_dynamic("timestamp", every="10m")
        .agg(pl.col("IrradiancePOAReference_5min.AVG").mean())
    )
    power_10min = target_pl.select(["timestamp", "ActivePowerTheoretical_10min.AVG"]).drop_nulls("ActivePowerTheoretical_10min.AVG")
    obj_10min = irradiance_10min.join(power_10min, on="timestamp", how="inner")

    # Aggregate to daily for monthly ratio calculation
    daily_power = (
        obj_10min.sort("timestamp")
        .group_by_dynamic("timestamp", every="1d")
        .agg(
            pl.col("IrradiancePOAReference_5min.AVG").sum(),
            pl.col("ActivePowerTheoretical_10min.AVG").sum(),
        )
    )
    daily_soiling = (
        target_pl.select(["timestamp", "SoilRateAdjusted_1d.AVG"])
        .drop_nulls("SoilRateAdjusted_1d.AVG")
        .sort("timestamp")
        .group_by_dynamic("timestamp", every="1d")
        .agg(pl.col("SoilRateAdjusted_1d.AVG").first())
    )
    daily_combined = daily_power.join(daily_soiling, on="timestamp", how="inner")

    monthly_ratios = self._calculate_monthly_weighted_soiling_ratio(daily_combined)

    # ------- 4: Calculate Soiling Losses on 10min frequency -------
    soiling_loss_pl = (
        obj_10min.with_columns(pl.col("timestamp").dt.truncate("1mo").alias("month_start"))
        .join(monthly_ratios, on="month_start", how="left")
        .with_columns(
            ((1 - pl.col("monthly_ratio")) * pl.col("ActivePowerTheoretical_10min.AVG")).alias("soiling_loss_10min"),
        )
        .select(["timestamp", "soiling_loss_10min"])
    )

    result_pl = (
        self._create_empty_result(period=period, freq="10min", result_type="Series")
        .join(soiling_loss_pl, on="timestamp", how="left")
        .with_columns(pl.col("soiling_loss_10min").alias(self.feature))
        .select(["timestamp", self.feature])
    )

    self._result = result_pl
    self.save(save_into=save_into, **kwargs)

    logger.debug(
        f"{self.object} - {self.feature} - {period}: "
        f"Soiling data processing {t1 - t0:.2f}s - "
        f"Ratio calculation {t2 - t1:.2f}s - "
        f"Loss calculation {perf_counter() - t2:.2f}s",
    )

    return result_pl
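The column renaming used twice in `calculate` is easiest to follow on a concrete name. The sketch below isolates the same string expression; `strip_column_name` and the object name `SITE-BRR-01` are made up for illustration. `str.removesuffix` requires Python 3.9 or later.

```python
def strip_column_name(column: str) -> str:
    """Drop the "Object@" prefix and the "_b#" suffix from a column name."""
    # split("@", 1) splits only at the first "@", so the feature part stays intact.
    return column.split("@", 1)[1].removesuffix("_b#")


columns = ["timestamp", "SITE-BRR-01@CellIrradiance1_5min.AVG_b#"]
# The timestamp column carries no prefix and is left out of the mapping.
renamed = {c: strip_column_name(c) for c in columns if c != "timestamp"}
```

Here the mapping turns `SITE-BRR-01@CellIrradiance1_5min.AVG_b#` into `CellIrradiance1_5min.AVG`, which is why later steps can refer to features by their plain names.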

`save(save_into=None, **kwargs)` ¶

Method to save the calculated feature values in performance_db.

Parameters:

save_into ¶
(Literal['all', 'performance_db'] | None, default: None ) –

Argument that will be passed to the method "save". The options are:

- "all": The feature will be saved in performance_db and bazefield.
- "performance_db": The feature will be saved only in performance_db.
- None: The feature will not be saved.

By default None.
**kwargs ¶
(dict, default: {} ) –

Not being used at the moment. Here only for compatibility.

Source code in echo_energycalc/feature_calc_core.py

Python

def save(
    self,
    save_into: Literal["all", "performance_db"] | None = None,
    **kwargs,  # noqa: ARG002
) -> None:
    """
    Method to save the calculated feature values in performance_db.

    Parameters
    ----------
    save_into : Literal["all", "performance_db"] | None, optional
        Argument that will be passed to the method "save". The options are:
        - "all": The feature will be saved in performance_db and bazefield.
        - "performance_db": the feature will be saved only in performance_db.
        - None: The feature will not be saved.

        By default None.
    **kwargs : dict, optional
        Not being used at the moment. Here only for compatibility.
    """
    # checking arguments
    if not isinstance(save_into, str | type(None)):
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}")
    if isinstance(save_into, str) and save_into not in ["all", "performance_db"]:
        raise ValueError(f"save_into must be 'all', 'performance_db' or None, not {save_into}")

    # checking if calculation was done
    if self.result is None:
        raise ValueError(
            "The calculation was not done. Please call 'calculate' before calling 'save'.",
        )

    if save_into is None:
        return

    upload_to_bazefield = save_into == "all"

    if not isinstance(self.result, pl.DataFrame):
        raise TypeError(f"result must be a polars DataFrame, not {type(self.result)}.")
    if "timestamp" not in self.result.columns:
        raise ValueError("result DataFrame must contain a 'timestamp' column.")

    # rename feature columns to "object@feature" format expected by perfdb polars insert
    feat_cols = [c for c in self.result.columns if c != "timestamp"]
    result_pl = self.result.rename({col: f"{self.object}@{col}" for col in feat_cols})

    self._perfdb.features.values.series.insert(
        df=result_pl,
        on_conflict="update",
        bazefield_upload=upload_to_bazefield,
    )

`soiling_ratio_adjusted(df)` ¶

Calculate and store adjusted soiling ratio in Bazefield.

This method processes the soiling data, detects events, interpolates clean values, and stores the result in Bazefield for later use.

Parameters:

df ¶
(DataFrame) –

Input DataFrame with soiling sensor data and a "timestamp" column.

Source code in echo_energycalc/solar_energy_loss_soiling.py

Python

def soiling_ratio_adjusted(self, df: pl.DataFrame) -> None:
    """
    Calculate and store adjusted soiling ratio in Bazefield.

    This method processes the soiling data, detects events, interpolates
    clean values, and stores the result in Bazefield for later use.

    Parameters
    ----------
    df : pl.DataFrame
        Input DataFrame with soiling sensor data and a ``"timestamp"`` column.
    """
    baze = Baze()

    interpolated = self._calculate_soiling_ratio_brr(df) if "BRR" in self.object else self._calculate_soiling_ratio_rbg(df)
    ratio_col = interpolated.columns[1]  # "SoilRatio" or "PM"

    self._insert_to_baze(
        baze,
        interpolated.with_columns(pl.col(ratio_col).clip(upper_bound=1.0).alias("value")).select(["timestamp", "value"]),
        "SoilRateAdjusted_1d.AVG",
    )
