
Solar Soiling Loss

Overview

SolarEnergyLossSoiling calculates the energy loss attributable to soiling (dust, bird droppings, etc.) on solar panels. The approach is based on detecting cleaning events from sensor data, interpolating a clean soiling ratio curve between those events, and applying monthly irradiance-weighted averages to the theoretical power.

Two sensor configurations are supported:

  • BRR objects: Dual irradiance sensors — soiling ratio = CellIrradiance1 / CellIrradiance2.
  • RBG objects: Short-circuit current measurements — soiling ratio = normalized Icc / Irradiance.

Calculation Logic

Period Expansion

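The soiling-ratio calculation (Steps 1 and 2) operates on an expanded window that starts on January 1st of the period's start year and ends on the last day of the period's end month, even if a shorter period is requested. This gives the event detection enough data to compute stable statistical thresholds. The loss calculation (Steps 3 and 4) uses the requested period expanded to whole calendar months.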
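
The window logic can be sketched with the standard library alone. This is a minimal illustration that mirrors the `replace` calls in `calculate`; the helper name `expand_periods` and the plain `datetime` pair (instead of `DateTimeRange`) are assumptions made for the example.

```python
from calendar import monthrange
from datetime import datetime


def expand_periods(start: datetime, end: datetime) -> dict[str, tuple[datetime, datetime]]:
    """Return the two windows derived from a requested [start, end] period."""
    last_day = monthrange(end.year, end.month)[1]
    month_end = end.replace(day=last_day, hour=23, minute=59, second=59, microsecond=999999)
    # Soiling-ratio window: January 1st of the start year to the end of the end month.
    soiling_start = start.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    # Loss window: the requested period widened to whole calendar months.
    calc_start = start.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    return {"soiling": (soiling_start, month_end), "calculate": (calc_start, month_end)}


# Hypothetical request covering ten days in May 2024.
windows = expand_periods(datetime(2024, 5, 10, 8, 30), datetime(2024, 5, 20, 17, 0))
```

For this request the soiling window begins on 2024-01-01 while the loss window begins on 2024-05-01; both end at the last microsecond of 2024-05-31.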

Step 1 — Soiling Ratio Calculation

For BRR objects (dual irradiance sensors):

  1. Compute raw soiling ratio: SR = CellIrradiance1 / CellIrradiance2.
  2. Filter to the midday window (11:00–13:00) and irradiance above 500 W/m².
  3. Resample to daily mean SR.
  4. Store the raw SR in Bazefield as SoilRateServerCalculated.
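
The BRR steps can be sketched in plain Python (the calculator itself works on Polars DataFrames). This is an illustration under stated assumptions, not the library's implementation: the window is treated as half-open (11:00 inclusive, 13:00 exclusive), the 500 W/m² filter is applied to both cells, and the function name and tuple layout are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

MIN_IRRADIANCE = 500.0  # W/m², same value as MIN_IRRADIANCE_THRESHOLD


def daily_soiling_ratio(samples):
    """samples: iterable of (timestamp, cell_irradiance_1, cell_irradiance_2)."""
    per_day = defaultdict(list)
    for ts, irr1, irr2 in samples:
        in_window = 11 <= ts.hour < 13  # assumed half-open midday window
        # Assumption: both cells must exceed the irradiance threshold.
        if in_window and irr1 > MIN_IRRADIANCE and irr2 > MIN_IRRADIANCE:
            per_day[ts.date()].append(irr1 / irr2)  # SR = CellIrradiance1 / CellIrradiance2
    # Daily mean of the filtered ratios.
    return {day: mean(ratios) for day, ratios in sorted(per_day.items())}


# Synthetic 5-minute samples for one day.
samples = [
    (datetime(2024, 6, 1, 10, 55), 900.0, 1000.0),  # outside the midday window
    (datetime(2024, 6, 1, 11, 0), 950.0, 1000.0),
    (datetime(2024, 6, 1, 12, 0), 970.0, 1000.0),
    (datetime(2024, 6, 1, 12, 30), 400.0, 420.0),   # below the irradiance threshold
]
daily = daily_soiling_ratio(samples)
```

Only the 11:00 and 12:00 samples survive the filters, so the daily value is the mean of their two ratios.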

For RBG objects (Icc current measurements):

  1. Filter to midday window (10:00–14:00) and clean module irradiance above 500 W/m².
  2. Compute daily performance metric: PM = Σ(Icc) / Σ(Irradiance).
  3. Normalize by the 99th percentile of PM (so 1.0 = best observed performance).
  4. Store normalized PM in Bazefield as SoilRateServerCalculated.
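
A minimal sketch of the RBG normalization, assuming the daily sums of Icc and irradiance have already been computed from the filtered samples. The percentile here uses `statistics.quantiles` with the inclusive method; the interpolation rule used by the calculator is not stated on this page, so the exact value of the 99th percentile may differ. The function name and input layout are hypothetical.

```python
from statistics import quantiles


def normalized_performance_metric(daily_sums):
    """daily_sums: {day: (sum_icc, sum_irradiance)} -> {day: normalized PM}."""
    # Daily performance metric: PM = sum(Icc) / sum(Irradiance).
    pm = {day: icc / irr for day, (icc, irr) in daily_sums.items() if irr > 0}
    # 99th percentile of PM (index 98 of the 99 percentile cut points).
    p99 = quantiles(pm.values(), n=100, method="inclusive")[98]
    return {day: value / p99 for day, value in pm.items()}


# Synthetic daily sums for three days.
daily_sums = {
    "2024-06-01": (9.0, 1000.0),
    "2024-06-02": (9.5, 1000.0),
    "2024-06-03": (10.0, 1000.0),
}
normalized = normalized_performance_metric(daily_sums)
```

Because the reference is a percentile rather than the maximum, the best days can land slightly above 1.0 after normalization.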

Step 2 — Event Detection and Interpolation

Applied to the daily SR/PM series:

  1. Compute a 7-day centered rolling median of the soiling ratio.
  2. Take the first difference of the rolling median.
  3. Detect events where the absolute change exceeds Q3 + 1.5 × IQR of all changes — these are cleaning or soiling events.
  4. Mark the start of each long null run (3 or more consecutive null days) as an additional event, flagged 5 days in advance.
  5. Keep only the soiling ratio at event markers; linearly interpolate all other values between events.

The adjusted soiling ratio is clipped to a maximum of 1.0 and stored in Bazefield as SoilRateAdjusted_1d.AVG.
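
The detection and interpolation steps can be sketched in plain Python. This is an illustration under stated assumptions rather than the calculator's code: the rolling window shrinks at the series edges, the quartiles are taken over the absolute changes, the first and last days are always kept as anchors, and the null-run rule from step 4 is left out. All function names are hypothetical.

```python
from statistics import median, quantiles


def rolling_median(values, window=7):
    """Centered rolling median; windows shrink at the series edges (assumption)."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1]) for i in range(len(values))]


def detect_events(values, window=7, multiplier=1.5):
    """Indices where the absolute change of the rolling median is an outlier."""
    smooth = rolling_median(values, window)
    changes = [abs(b - a) for a, b in zip(smooth, smooth[1:])]
    # Assumption: Q1 and Q3 are computed over the absolute changes.
    q1, _, q3 = quantiles(changes, n=4, method="inclusive")
    threshold = q3 + multiplier * (q3 - q1)
    # changes[i] is the step from day i to day i + 1, so the event sits at i + 1.
    return [i + 1 for i, change in enumerate(changes) if change > threshold]


def interpolate_between(values, anchors):
    """Keep values at the anchor indices and interpolate linearly in between."""
    out = list(values)
    for left, right in zip(anchors, anchors[1:]):
        span = right - left
        for i in range(left + 1, right):
            weight = (i - left) / span
            out[i] = values[left] * (1 - weight) + values[right] * weight
    return out


# Synthetic series: slow soiling, a cleaning on day 20, then slow soiling again.
series = [1.0 - 0.002 * day for day in range(20)] + [1.0 - 0.002 * day for day in range(20)]
events = detect_events(series)
anchors = sorted({0, *events, len(series) - 1})
adjusted = interpolate_between(series, anchors)
```

In this synthetic series the only change large enough to pass the threshold is the jump caused by the cleaning, so a single event is detected at index 20.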

Step 3 — Monthly Weighted Soiling Ratio

For each calendar month, compute an irradiance-weighted average soiling ratio:

monthly_ratio = Σ(SoilRateAdjusted × Irradiance) / Σ(Irradiance)

Irradiance weighting gives more importance to high-production days, where soiling losses are largest.
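
A small worked example of the weighted average, in plain Python. The function name and the `(day, ratio, irradiance)` tuple layout are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def monthly_weighted_ratio(daily):
    """daily: iterable of (day, adjusted_ratio, daily_irradiance_sum)."""
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for day, ratio, irradiance in daily:
        key = (day.year, day.month)
        numerator[key] += ratio * irradiance
        denominator[key] += irradiance
    # Months without any irradiance are skipped to avoid dividing by zero.
    return {key: numerator[key] / denominator[key] for key in numerator if denominator[key] > 0}


# Two synthetic days: a sunny day with little soiling and a cloudy day with more.
daily = [
    (date(2024, 6, 1), 0.98, 6000.0),
    (date(2024, 6, 2), 0.94, 2000.0),
]
ratios = monthly_weighted_ratio(daily)
```

Here the weighted ratio is (0.98 × 6000 + 0.94 × 2000) / 8000 = 0.97, whereas an unweighted mean of the two days would give 0.96.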

Step 4 — Loss Application

For each 10-minute timestamp in the target period:

soiling_loss_kW = (1 - monthly_ratio_for_this_month) × ActivePowerTheoretical_10min.AVG
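
As a sketch, the same formula applied to a few 10-minute samples in plain Python. Returning `None` when a month has no ratio mirrors the left join used in `calculate`; the function name and data layout are hypothetical.

```python
from datetime import datetime


def soiling_loss_kw(timestamp, theoretical_kw, monthly_ratios):
    """Loss for one 10-minute sample, or None if the month has no ratio."""
    ratio = monthly_ratios.get((timestamp.year, timestamp.month))
    if ratio is None or theoretical_kw is None:
        return None
    return (1.0 - ratio) * theoretical_kw


monthly_ratios = {(2024, 6): 0.97}
samples = [
    (datetime(2024, 6, 15, 12, 0), 1200.0),
    (datetime(2024, 6, 15, 12, 10), 1150.0),
    (datetime(2024, 7, 1, 0, 0), 0.0),  # no ratio for July in this example
]
losses = [soiling_loss_kw(ts, power, monthly_ratios) for ts, power in samples]
```

With a monthly ratio of 0.97, a theoretical power of 1200 kW yields a loss of about 36 kW for that timestamp.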

Database Requirements

Feature Attribute

| Attribute | Value |
| --- | --- |
| server_calc_type | solar_energy_loss_soiling |

Object Attributes

| Attribute | Required | Description |
| --- | --- | --- |
| reference_soiling_stations | Yes | List of soiling measurement station names. The first entry is used. |
| reference_weather_stations | Yes | Dict with "complete_ws" key naming the weather station that stores the adjusted soiling rate in Bazefield. |
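
For illustration, the two attributes might hold values shaped like the following. The station names are hypothetical; only the structure (a list and a dict with a "complete_ws" key) comes from the descriptions above.

```python
# Hypothetical attribute values for one solar object.
object_attributes = {
    "reference_soiling_stations": ["SOILING-STATION-01", "SOILING-STATION-02"],
    "reference_weather_stations": {"complete_ws": "WEATHER-STATION-01"},
}

# The calculator uses the first soiling station and the "complete_ws" weather station.
soiling_station = object_attributes["reference_soiling_stations"][0]
weather_station = object_attributes["reference_weather_stations"]["complete_ws"]
```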

Features (soiling station — from Bazefield)

BRR objects:

| Feature | Description |
| --- | --- |
| CellIrradiance1_5min.AVG | Clean reference cell irradiance (W/m²) |
| CellIrradiance2_5min.AVG | Soiled cell irradiance (W/m²) |

RBG objects:

| Feature | Description |
| --- | --- |
| SoilModuleIccCurrent | Short-circuit current of soiled module |
| SoilModuleIrradiance | Irradiance at soiled module |
| CleanModuleIrradiance | Irradiance at clean reference module |

Features (object — from Bazefield)

| Feature | Description |
| --- | --- |
| IrradiancePOAReference_5min.AVG | Plane-of-array reference irradiance (W/m²). Used as irradiance weight in monthly ratio calculation. |
| ActivePowerTheoretical_10min.AVG | Theoretical power (kW). Loss is applied to this. |

Features (complete weather station — from Bazefield)

| Feature | Description |
| --- | --- |
| SoilRateAdjusted_1d.AVG | Adjusted soiling ratio previously stored by the calculator itself. Read back for the target-period calculation step. |

Module-Level Constants

| Constant | Value | Description |
| --- | --- | --- |
| SOILING_DETECTION_WINDOW | 7 days | Rolling window for median-based event detection |
| SOILING_THRESHOLD_MULTIPLIER | 1.5 | IQR multiplier for the event detection threshold |
| MIN_IRRADIANCE_THRESHOLD | 500 W/m² | Minimum irradiance for valid soiling ratio data |
| NORMALIZATION_PERCENTILE | 0.99 | Percentile used to normalize RBG performance metric |

Class Definition

SolarEnergyLossSoiling(object_name, feature)

Calculator for solar energy losses due to soiling effects.

This class calculates soiling losses based on:

  1. Soiling ratio calculated from dual irradiance sensors (for BRR objects) or current measurements (for RBG objects).
  2. Linear interpolation between cleaning/soiling events.
  3. Monthly weighted averages using irradiance as weights.
  4. Application of monthly ratios to the 10-minute theoretical power, which gives LostActivePowerSoiling.

The calculation process:

  • Detects soiling/cleaning events using statistical thresholds (Q3 + 1.5 × IQR of the rolling-median changes).
  • Reconstructs the clean/soiling curve via linear interpolation.
  • Calculates monthly weighted soiling ratios.
  • Applies the monthly ratios to the 10-minute theoretical power to get the loss at each timestamp.

Parameters:

  • object_name

    (str) –

    Name of the solar asset object in performance_db.

  • feature

    (str) –

    Name of the soiling loss feature to calculate.

Raises:

  • ValueError

    If required object attributes are missing from the database.

Source code in echo_energycalc/solar_energy_loss_soiling.py
Python
def __init__(self, object_name: str, feature: str) -> None:
    """
    Initialize the soiling loss calculator.

    Parameters
    ----------
    object_name : str
        Name of the solar asset object in performance_db.
    feature : str
        Name of the soiling loss feature to calculate.

    Raises
    ------
    ValueError
        If required object attributes are missing from the database.
    """
    super().__init__(object_name, feature)

    self._add_requirement(
        RequiredObjectAttributes(
            {
                self.object: [
                    "reference_soiling_stations",
                    "reference_weather_stations",
                ],
            },
        ),
    )
    self._fetch_requirements()

    self._soiling_station = self._requirement_data("RequiredObjectAttributes")[self.object]["reference_soiling_stations"][0]
    self._weather_station = self._requirement_data("RequiredObjectAttributes")[self.object]["reference_weather_stations"]["complete_ws"]

feature property

Feature that is calculated. This will be defined in the constructor and cannot be changed.

Returns:

  • str

    Name of the feature that is calculated.

name property

Name of the feature calculator. Is defined in child classes of FeatureCalculator.

This must be equal to the "server_calc_type" attribute of the feature in performance_db.

Returns:

  • str

    Name of the feature calculator.

object property

Object for which the feature is calculated. This will be defined in the constructor and cannot be changed.

Returns:

  • str

    Object name for which the feature is calculated.

requirements property

List of requirements of the feature calculator. Is defined in child classes of FeatureCalculator.

Returns:

  • dict[str, list[CalculationRequirement]]

    Dict of requirements.

    The keys are the names of the classes of the requirements and the values are lists of requirements of that class.

    For example: {"RequiredFeatures": [RequiredFeatures(...), RequiredFeatures(...)], "RequiredObjects": [RequiredObjects(...)]}

result property

Result of the calculation. This is None until the method "calculate" is called.

Returns:

  • DataFrame | None

    Polars DataFrame with a "timestamp" column and one or more feature value columns. None until calculate is called.

calculate(period, save_into=None, cached_data=None, **kwargs)

Calculate soiling losses for the specified period at 10-minute resolution.

The calculation process:

  1. Get soiling sensor data for the year (from January 1st), to better capture seasonal trends and the whole data distribution.
  2. Calculate adjusted soiling ratios by detecting cleaning/soiling events and tracing an interpolated line between events.
  3. Store the calculated SR and adjusted SR in Bazefield.
  4. Get theoretical power and adjusted ratios for the target period.
  5. Calculate monthly weighted soiling ratios.
  6. Apply the monthly ratios to the 10-minute theoretical power.

Parameters:

  • period

    (DateTimeRange) –

    Target period for loss calculation.

  • save_into

    (Literal['all', 'performance_db'] | None, default: None ) –

    Where to save results. Default is None.

  • cached_data

    (DataFrame | None, default: None ) –

    Pre-calculated data to improve performance. Default is None.

  • **kwargs

    (dict, default: {} ) –

    Additional arguments for the save method.

Returns:

  • DataFrame

    Polars DataFrame with the calculated soiling losses.

Source code in echo_energycalc/solar_energy_loss_soiling.py
Python
def calculate(
    self,
    period: DateTimeRange,
    save_into: Literal["all", "performance_db"] | None = None,
    cached_data: pl.DataFrame | None = None,
    **kwargs,
) -> pl.DataFrame:
    """
    Calculate soiling losses for the specified period at 10-minute resolution.

    The calculation process:
    1. Get soiling sensor data for entire year (from Jan 1) - to better capture seasonal trends and get the whole data distribution
    2. Calculate adjusted soiling ratios by detecting clean/soiling events and trace an interpolated line between events
    3. Store calculated SR and adjusted SR in bazefield
    4. Get theoretical power and adjusted ratios for target period
    5. Calculate monthly weighted soiling ratios
    6. Apply monthly ratios to 10min theoretical power

    Parameters
    ----------
    period : DateTimeRange
        Target period for loss calculation.
    save_into : Literal["all", "performance_db"] | None, optional
        Where to save results. Default is None.
    cached_data : pl.DataFrame | None, optional
        Pre-calculated data to improve performance. Default is None.
    **kwargs : dict
        Additional arguments for the save method.

    Returns
    -------
    pl.DataFrame
        Polars DataFrame with the calculated soiling losses.
    """
    t0 = perf_counter()

    # ------- Defining periods -------
    year = period.end.year
    month = period.end.month
    last_day = monthrange(year, month)[1]

    soiling_period = period.copy()
    soiling_period.start = soiling_period.start.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    soiling_period.end = soiling_period.end.replace(day=last_day, hour=23, minute=59, second=59, microsecond=999999)

    calculate_period = period.copy()
    calculate_period.start = calculate_period.start.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    calculate_period.end = calculate_period.end.replace(day=last_day, hour=23, minute=59, second=59, microsecond=999999)

    # ------ 1: Get soiling data and calculate adjusted SR ------
    soiling_features = (
        ["CellIrradiance1_5min.AVG", "CellIrradiance2_5min.AVG"]
        if "BRR" in self.object
        else ["SoilModuleIccCurrent", "SoilModuleIrradiance", "CleanModuleIrradiance"]
    )
    features = {
        self._soiling_station: [f"{feat}_b#" for feat in soiling_features],
        self.object: ["IrradiancePOAReference_5min.AVG_b#"],
    }
    self._add_requirement(RequiredFeatures(features=features))
    self._fetch_requirements(
        period=soiling_period,
        reindex=None,
        round_timestamps={"freq": timedelta(minutes=5), "tolerance": timedelta(minutes=2)},
        cached_data=cached_data,
    )
    soiling_pl = self._requirement_data("RequiredFeatures")
    # Strip "Obj@" prefix and "_b#" suffix from all feature columns
    soiling_pl = soiling_pl.rename({c: c.split("@", 1)[1].removesuffix("_b#") for c in soiling_pl.columns if c != "timestamp"})

    t1 = perf_counter()

    # Adjust dataframe and get SR adjusted values (writes to Bazefield as a side effect)
    self.soiling_ratio_adjusted(soiling_pl)

    t2 = perf_counter()

    # ------ 2: Get target period data with adjusted ratios -------
    self._requirements = None
    target_features = {
        self._weather_station: ["SoilRateAdjusted_1d.AVG_b#"],
        self.object: [
            "IrradiancePOAReference_5min.AVG_b#",
            "ActivePowerTheoretical_10min.AVG_b#",
        ],
    }
    self._add_requirement(RequiredFeatures(features=target_features))
    self._fetch_requirements(
        period=calculate_period,
        reindex=None,
        round_timestamps={"freq": timedelta(minutes=5), "tolerance": timedelta(minutes=2)},
        cached_data=cached_data,
    )
    target_pl = self._requirement_data("RequiredFeatures")
    # Strip "Obj@" prefix and "_b#" suffix
    target_pl = target_pl.rename({c: c.split("@", 1)[1].removesuffix("_b#") for c in target_pl.columns if c != "timestamp"})

    # ------- 3: Calculate Monthly Soiling Ratio weighted by Irradiance -------

    # Resample irradiance from 5min to 10min to match theoretical power frequency
    irradiance_10min = (
        target_pl.select(["timestamp", "IrradiancePOAReference_5min.AVG"])
        .sort("timestamp")
        .group_by_dynamic("timestamp", every="10m")
        .agg(pl.col("IrradiancePOAReference_5min.AVG").mean())
    )
    power_10min = target_pl.select(["timestamp", "ActivePowerTheoretical_10min.AVG"]).drop_nulls("ActivePowerTheoretical_10min.AVG")
    obj_10min = irradiance_10min.join(power_10min, on="timestamp", how="inner")

    # Aggregate to daily for monthly ratio calculation
    daily_power = (
        obj_10min.sort("timestamp")
        .group_by_dynamic("timestamp", every="1d")
        .agg(
            pl.col("IrradiancePOAReference_5min.AVG").sum(),
            pl.col("ActivePowerTheoretical_10min.AVG").sum(),
        )
    )
    daily_soiling = (
        target_pl.select(["timestamp", "SoilRateAdjusted_1d.AVG"])
        .drop_nulls("SoilRateAdjusted_1d.AVG")
        .sort("timestamp")
        .group_by_dynamic("timestamp", every="1d")
        .agg(pl.col("SoilRateAdjusted_1d.AVG").first())
    )
    daily_combined = daily_power.join(daily_soiling, on="timestamp", how="inner")

    monthly_ratios = self._calculate_monthly_weighted_soiling_ratio(daily_combined)

    # ------- 4: Calculate Soiling Losses on 10min frequency -------
    soiling_loss_pl = (
        obj_10min.with_columns(pl.col("timestamp").dt.truncate("1mo").alias("month_start"))
        .join(monthly_ratios, on="month_start", how="left")
        .with_columns(
            ((1 - pl.col("monthly_ratio")) * pl.col("ActivePowerTheoretical_10min.AVG")).alias("soiling_loss_10min"),
        )
        .select(["timestamp", "soiling_loss_10min"])
    )

    result_pl = (
        self._create_empty_result(period=period, freq="10min", result_type="Series")
        .join(soiling_loss_pl, on="timestamp", how="left")
        .with_columns(pl.col("soiling_loss_10min").alias(self.feature))
        .select(["timestamp", self.feature])
    )

    self._result = result_pl
    self.save(save_into=save_into, **kwargs)

    logger.debug(
        f"{self.object} - {self.feature} - {period}: "
        f"Soiling data processing {t1 - t0:.2f}s - "
        f"Ratio calculation {t2 - t1:.2f}s - "
        f"Loss calculation {perf_counter() - t2:.2f}s",
    )

    return result_pl
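
The column renaming used twice in `calculate` can be tried in isolation. This is a small sketch of the same string expression; the helper name and the object name in the example are hypothetical, and `str.removesuffix` requires Python 3.9 or later.

```python
def strip_column_name(column: str) -> str:
    """Drop the "Obj@" prefix and the "_b#" suffix from a fetched column name."""
    return column.split("@", 1)[1].removesuffix("_b#")


# Hypothetical object name; the feature name is one of the required features.
fetched = "PLANT-BRR-01@CellIrradiance1_5min.AVG_b#"
cleaned = strip_column_name(fetched)
```

Splitting with `maxsplit=1` keeps any later "@" characters in the feature name intact, and a column without the "_b#" suffix is returned unchanged after the prefix is removed.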

save(save_into=None, **kwargs)

Method to save the calculated feature values in performance_db.

Parameters:

  • save_into

    (Literal['all', 'performance_db'] | None, default: None ) –

    Argument that will be passed to the method "save". The options are:

      • "all": The feature will be saved in performance_db and bazefield.
      • "performance_db": The feature will be saved only in performance_db.
      • None: The feature will not be saved.

    By default None.

  • **kwargs

    (dict, default: {} ) –

    Not being used at the moment. Here only for compatibility.

Source code in echo_energycalc/feature_calc_core.py
Python
def save(
    self,
    save_into: Literal["all", "performance_db"] | None = None,
    **kwargs,  # noqa: ARG002
) -> None:
    """
    Method to save the calculated feature values in performance_db.

    Parameters
    ----------
    save_into : Literal["all", "performance_db"] | None, optional
        Argument that will be passed to the method "save". The options are:
        - "all": The feature will be saved in performance_db and bazefield.
        - "performance_db": the feature will be saved only in performance_db.
        - None: The feature will not be saved.

        By default None.
    **kwargs : dict, optional
        Not being used at the moment. Here only for compatibility.
    """
    # checking arguments
    if not isinstance(save_into, str | type(None)):
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}")
    if isinstance(save_into, str) and save_into not in ["all", "performance_db"]:
        raise ValueError(f"save_into must be 'all', 'performance_db' or None, not {save_into}")

    # checking if calculation was done
    if self.result is None:
        raise ValueError(
            "The calculation was not done. Please call 'calculate' before calling 'save'.",
        )

    if save_into is None:
        return

    upload_to_bazefield = save_into == "all"

    if not isinstance(self.result, pl.DataFrame):
        raise TypeError(f"result must be a polars DataFrame, not {type(self.result)}.")
    if "timestamp" not in self.result.columns:
        raise ValueError("result DataFrame must contain a 'timestamp' column.")

    # rename feature columns to "object@feature" format expected by perfdb polars insert
    feat_cols = [c for c in self.result.columns if c != "timestamp"]
    result_pl = self.result.rename({col: f"{self.object}@{col}" for col in feat_cols})

    self._perfdb.features.values.series.insert(
        df=result_pl,
        on_conflict="update",
        bazefield_upload=upload_to_bazefield,
    )

soiling_ratio_adjusted(df)

Calculate and store adjusted soiling ratio in Bazefield.

This method processes the soiling data, detects events, interpolates clean values, and stores the result in Bazefield for later use.

Parameters:

  • df

    (DataFrame) –

    Input DataFrame with soiling sensor data and a "timestamp" column.

Source code in echo_energycalc/solar_energy_loss_soiling.py
Python
def soiling_ratio_adjusted(self, df: pl.DataFrame) -> None:
    """
    Calculate and store adjusted soiling ratio in Bazefield.

    This method processes the soiling data, detects events, interpolates
    clean values, and stores the result in Bazefield for later use.

    Parameters
    ----------
    df : pl.DataFrame
        Input DataFrame with soiling sensor data and a ``"timestamp"`` column.
    """
    baze = Baze()

    interpolated = self._calculate_soiling_ratio_brr(df) if "BRR" in self.object else self._calculate_soiling_ratio_rbg(df)
    ratio_col = interpolated.columns[1]  # "SoilRatio" or "PM"

    self._insert_to_baze(
        baze,
        interpolated.with_columns(pl.col(ratio_col).clip(upper_bound=1.0).alias("value")).select(["timestamp", "value"]),
        "SoilRateAdjusted_1d.AVG",
    )