Solar Soiling Loss

Overview

The SolarEnergyLossSoiling class is a subclass of SolarEnergyLossCalculator and FeatureCalculator that calculates the value of energy loss due to soiling effects on solar panels.

Calculation Logic

The calculation works as follows:

  1. Soiling Ratio Calculation:
     • For BRR objects: uses dual irradiance sensors (CellIrradiance1 / CellIrradiance2).
     • For RBG objects: uses short-circuit current measurements (Icc / SolarIrradiance), normalized by the 99th-percentile performance.

  2. Event Detection: applies a centered rolling median (7-day window) to the soiling ratio and flags a cleaning/soiling event when the change in that median exceeds the Q3 + 1.5 × IQR threshold.

  3. Linear Interpolation: reconstructs a clean soiling curve by interpolating between detected events, removing the effect of cleaning activities.

  4. Monthly Weighted Averaging: calculates monthly soiling ratios weighted by daily irradiance: Σ(soiling_ratio × irradiance) / Σ(irradiance). This gives more weight to high-irradiance days and less to low-irradiance days, on which the difference between soiled and clean modules may not be meaningful.

  5. Loss Application: applies the monthly ratios to 10-minute theoretical power data: (1 - monthly_ratio) × theoretical_power.
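
The event detection and interpolation steps can be sketched on synthetic data as follows. This is a minimal illustration, not the class's actual private implementation: the function names, the use of the absolute day-to-day change, and the choice of anchor points (the day before each event, the event day, and the series endpoints) are all assumptions.

```python
import pandas as pd

SOILING_DETECTION_WINDOW = 7          # days, as in the configuration constants
SOILING_THRESHOLD_MULTIPLIER = 1.5    # IQR multiplier, as in the constants


def detect_events(daily_ratio: pd.Series) -> pd.Series:
    """Flag days whose rolling-median change exceeds Q3 + 1.5 * IQR (assumed form)."""
    smoothed = daily_ratio.rolling(
        SOILING_DETECTION_WINDOW, center=True, min_periods=1
    ).median()
    change = smoothed.diff().abs()
    q1, q3 = change.quantile(0.25), change.quantile(0.75)
    threshold = q3 + SOILING_THRESHOLD_MULTIPLIER * (q3 - q1)
    return change > threshold


def interpolate_between_events(daily_ratio: pd.Series, events: pd.Series) -> pd.Series:
    """Keep values around each event plus the endpoints; interpolate linearly in between."""
    anchors = events | events.shift(-1, fill_value=False)  # event day and the day before
    anchors.iloc[0] = True
    anchors.iloc[-1] = True
    return daily_ratio.where(anchors).interpolate(method="time")


# Synthetic data: slow soiling, one cleaning on the 16th day, then soiling again.
days = pd.date_range("2024-01-01", periods=30, freq="D")
daily_ratio = pd.Series([1 - 0.005 * (i % 15) for i in range(30)], index=days)

events = detect_events(daily_ratio)
curve = interpolate_between_events(daily_ratio, events)
print(events[events].index.tolist())
```

On real, noisy sensor data the threshold would be driven by the spread of day-to-day changes, so the number of detected events depends strongly on data quality.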

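For BRR objects, the soiling ratio calculation extends the period back to January 1st and forward to the end of the target month, to better capture seasonal trends and the complete data distribution. For RBG objects, the source code on this page uses a fixed start date of April 1st, 2024, the beginning of the RBG soiling data.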
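
The period extension for BRR objects can be illustrated with plain datetime values. The helper name below is hypothetical, and the sketch uses a tuple of datetimes rather than the library's DateTimeRange type.

```python
from calendar import monthrange
from datetime import datetime


def extend_to_year_window(start: datetime, end: datetime) -> tuple[datetime, datetime]:
    """Move the start to January 1st and the end to the last instant of the end month."""
    last_day = monthrange(end.year, end.month)[1]  # number of days in the end month
    new_start = start.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    new_end = end.replace(day=last_day, hour=23, minute=59, second=59, microsecond=999999)
    return new_start, new_end


window = extend_to_year_window(datetime(2024, 2, 10, 8, 30), datetime(2024, 2, 12))
print(window)
```

Because monthrange accounts for leap years, a request ending in February 2024 is extended to February 29th.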

Database Requirements

  • Feature attribute server_calc_type must be set to 'solar_energy_loss_soiling'.
  • Object must have reference_soiling_stations and reference_weather_stations attributes configured.

Required Data

Object Attributes

  • reference_soiling_stations: Soiling measurement station(s)
  • reference_weather_stations: Associated weather station for irradiance data

Features

Soiling Station:

  • CellIrradiance1_5min.AVG, CellIrradiance2_5min.AVG (BRR objects)
  • SoilModuleIccCurrent, SoilModuleIrradiance, CleanModuleIrradiance (RBG objects)

Object:

  • IrradiancePOAReference_5min.AVG: Plane-of-array irradiance
  • ActivePowerTheoretical_10min.AVG: Theoretical power generation

Weather Station:

  • SoilRateAdjusted_1d.AVG: Calculated adjusted soiling rate (intermediate)

Configuration Constants

SOILING_DETECTION_WINDOW = 7           # Days for rolling median
SOILING_THRESHOLD_MULTIPLIER = 1.5     # IQR multiplier for event detection
MIN_IRRADIANCE_THRESHOLD = 500         # W/m² minimum for valid data
NORMALIZATION_PERCENTILE = 0.99        # Percentile for performance normalization
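
How these constants might enter the raw soiling ratio is sketched below. The private methods that compute the ratio are not shown on this page, so the function names, the way the irradiance threshold is applied, and the use of SoilModuleIrradiance as the denominator for RBG objects are assumptions made for illustration only.

```python
import pandas as pd

MIN_IRRADIANCE_THRESHOLD = 500    # W/m², from the configuration constants
NORMALIZATION_PERCENTILE = 0.99   # from the configuration constants


def brr_daily_ratio(df: pd.DataFrame) -> pd.Series:
    """Daily mean of CellIrradiance1 / CellIrradiance2 on high-irradiance samples."""
    c1, c2 = df["CellIrradiance1_5min.AVG"], df["CellIrradiance2_5min.AVG"]
    valid = (c1 >= MIN_IRRADIANCE_THRESHOLD) & (c2 >= MIN_IRRADIANCE_THRESHOLD)
    return (c1[valid] / c2[valid]).resample("D").mean().dropna()


def rbg_daily_ratio(df: pd.DataFrame) -> pd.Series:
    """Daily mean of Icc / irradiance, normalized by its 99th percentile."""
    irradiance = df["SoilModuleIrradiance"]  # assumed denominator
    valid = irradiance >= MIN_IRRADIANCE_THRESHOLD
    performance = df["SoilModuleIccCurrent"][valid] / irradiance[valid]
    normalized = performance / performance.quantile(NORMALIZATION_PERCENTILE)
    return normalized.resample("D").mean().dropna()


# Two samples on each of two days; the second sample of day 1 is below 500 W/m².
index = pd.to_datetime(
    ["2024-01-01 12:00", "2024-01-01 12:05", "2024-01-02 12:00", "2024-01-02 12:05"]
)
brr = pd.DataFrame(
    {
        "CellIrradiance1_5min.AVG": [950.0, 300.0, 900.0, 540.0],
        "CellIrradiance2_5min.AVG": [1000.0, 320.0, 1000.0, 600.0],
    },
    index=index,
)
rbg = pd.DataFrame(
    {
        "SoilModuleIccCurrent": [9.0, 2.7, 8.1, 4.86],
        "SoilModuleIrradiance": [1000.0, 300.0, 1000.0, 600.0],
    },
    index=index,
)

brr_ratio = brr_daily_ratio(brr)
rbg_ratio = rbg_daily_ratio(rbg)
print(brr_ratio, rbg_ratio, sep="\n")
```

Normalizing by a high percentile instead of the maximum makes the reference "clean" performance less sensitive to single outlier samples.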

Class Definition

SolarEnergyLossSoiling(object_name, feature)

Calculator for solar energy losses due to soiling effects.

This class calculates daily soiling losses based on:

  1. Soiling ratio calculated from dual irradiance sensors (for BRR objects) or current measurements (for RBG objects)
  2. Linear interpolation between cleaning/soiling events
  3. Monthly weighted averages using irradiance as weights
  4. Application of monthly ratios to daily theoretical power = LostActivePowerSoiling

The calculation process:

  • Detects soiling/cleaning events using statistical thresholds (1.5 * IQR of rolling median changes)
  • Reconstructs clean/soiling curve via linear interpolation
  • Calculates monthly weighted soiling ratios
  • Applies monthly ratios to daily theoretical power to get losses for each day
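
The monthly weighted ratio and its application to theoretical power can be worked through on a few rows. The column names and the helper function here are hypothetical; the formula is the one stated in the Calculation Logic section, Σ(soiling_ratio × irradiance) / Σ(irradiance), followed by (1 - monthly_ratio) × theoretical_power.

```python
import pandas as pd


def monthly_weighted_ratio(daily: pd.DataFrame) -> pd.Series:
    """Irradiance-weighted mean soiling ratio per calendar month."""
    months = daily.index.to_period("M")
    weighted_sum = (daily["soiling_ratio"] * daily["irradiance"]).groupby(months).sum()
    weight_sum = daily["irradiance"].groupby(months).sum()
    return weighted_sum / weight_sum


# Daily inputs: a sunny and a cloudy day in January, one day in February.
daily = pd.DataFrame(
    {"soiling_ratio": [0.95, 0.90, 0.98], "irradiance": [6000.0, 2000.0, 5000.0]},
    index=pd.to_datetime(["2024-01-10", "2024-01-11", "2024-02-05"]),
)
monthly = monthly_weighted_ratio(daily)

# 10-minute theoretical power; each row takes the ratio of its month.
power = pd.DataFrame(
    {"theoretical_power": [100.0, 200.0, 100.0]},
    index=pd.to_datetime(["2024-01-15 12:00", "2024-01-15 12:10", "2024-02-15 12:00"]),
)
power["month"] = power.index.to_period("M")
power["monthly_ratio"] = power["month"].map(monthly)
power["soiling_loss"] = (1 - power["monthly_ratio"]) * power["theoretical_power"]
print(power[["monthly_ratio", "soiling_loss"]])
```

For January the weighted ratio is (0.95 × 6000 + 0.90 × 2000) / 8000 = 0.9375, closer to the sunny day's value than the plain mean of 0.925.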

Parameters:

  • object_name

    (str) –

    Name of the solar asset object in performance_db.

  • feature

    (str) –

    Name of the soiling loss feature to calculate.

Raises:

  • ValueError

    If required object attributes are missing from the database.

Source code in echo_energycalc/solar_energy_loss_soiling.py
def __init__(self, object_name: str, feature: str) -> None:
    """
    Initialize the soiling loss calculator.

    Parameters
    ----------
    object_name : str
        Name of the solar asset object in performance_db.
    feature : str
        Name of the soiling loss feature to calculate.

    Raises
    ------
    ValueError
        If required object attributes are missing from the database.
    """
    super().__init__(object_name, feature)

    # Define the object attributes required for the calculation and fetch them from the database
    self._add_requirement(
        RequiredObjectAttributes(
            {
                self.object: [
                    "reference_soiling_stations",
                    "reference_weather_stations",
                ],
            },
        ),
    )
    self._get_required_data()

    # Get soiling station and weather station references
    self._soiling_station = self._get_requirement_data("RequiredObjectAttributes")[self.object]["reference_soiling_stations"][0]
    self._weather_station = self._get_requirement_data("RequiredObjectAttributes")[self.object]["reference_weather_stations"][
        "complete_ws"
    ]

feature property

Feature that is calculated. This will be defined in the constructor and cannot be changed.

Returns:

  • str

    Name of the feature that is calculated.

name property

Name of the feature calculator. Is defined in child classes of FeatureCalculator.

This must be equal to the "server_calc_type" attribute of the feature in performance_db.

Returns:

  • str

    Name of the feature calculator.

object property

Object for which the feature is calculated. This will be defined in the constructor and cannot be changed.

Returns:

  • str

    Object name for which the feature is calculated.

requirements property

List of requirements of the feature calculator. Is defined in child classes of FeatureCalculator.

Returns:

  • dict[str, list[CalculationRequirement]]

    Dict of requirements.

    The keys are the names of the classes of the requirements and the values are lists of requirements of that class.

    For example: {"RequiredFeatures": [RequiredFeatures(...), RequiredFeatures(...)], "RequiredObjects": [RequiredObjects(...)]}

result property

Result of the calculation. This is None until the method "calculate" is called.

Returns:

  • Series | DataFrame | None:

    Result of the calculation if the method "calculate" was called. None otherwise.

calculate(period, save_into=None, cached_data=None, **kwargs)

Calculate daily soiling losses for the specified period.

The calculation process:

  1. Get soiling sensor data for the entire year (from Jan 1), to better capture seasonal trends and get the whole data distribution
  2. Calculate adjusted soiling ratios by detecting clean/soiling events and tracing an interpolated line between events
  3. Store calculated SR and adjusted SR in bazefield
  4. Get theoretical power and adjusted ratios for the target period
  5. Calculate monthly weighted soiling ratios
  6. Apply monthly ratios to 10min theoretical power

Parameters:

  • period

    (DateTimeRange) –

    Target period for loss calculation.

  • save_into

    (Literal['all', 'performance_db'] | None, default: None ) –

    Where to save results. Default is None.

  • cached_data

    (DataFrame | None, default: None ) –

    Pre-calculated data to improve performance. Default is None.

  • **kwargs

    (dict, default: {} ) –

    Additional arguments for the save method.

Returns:

  • Series

    Soiling losses at 10-minute resolution, indexed by timestamp (the result is built with freq="10min").

Source code in echo_energycalc/solar_energy_loss_soiling.py
def calculate(
    self,
    period: DateTimeRange,
    save_into: Literal["all", "performance_db"] | None = None,
    cached_data: DataFrame | None = None,
    **kwargs,
) -> Series:
    """
    Calculate daily soiling losses for the specified period.

    The calculation process:
    1. Get soiling sensor data for entire year (from Jan 1) - to better capture seasonal trends and get the whole data distribution
    2. Calculate adjusted soiling ratios by detecting clean/soiling events and tracing an interpolated line between events
    3. Store calculated SR and adjusted SR in bazefield
    4. Get theoretical power and adjusted ratios for target period
    5. Calculate monthly weighted soiling ratios
    6. Apply monthly ratios to 10min theoretical power

    Parameters
    ----------
    period : DateTimeRange
        Target period for loss calculation.
    save_into : Literal["all", "performance_db"] | None, optional
        Where to save results. Default is None.
    cached_data : DataFrame | None, optional
        Pre-calculated data to improve performance. Default is None.
    **kwargs : dict
        Additional arguments for the save method.

    Returns
    -------
    Series
        Soiling losses at 10-minute resolution, indexed by timestamp.
    """
    t0 = perf_counter()

    # ------- Defining periods -------

    # Getting the last day of the month for period end - this is done to get the whole monthly SR variability
    year = period.end.year
    month = period.end.month
    last_day = monthrange(year, month)[1]
    # Soiling Ratio Adjusted calculation period - from Jan 1 to the end of the target month
    soiling_period = period.copy()
    if "BRR" in self.object:
        soiling_period.start = soiling_period.start.replace(
            month=1,
            day=1,
            hour=0,
            minute=0,
            second=0,
            microsecond=0,
        )
    else:
        soiling_period.start = datetime(2024, 4, 1)  # start of the RBG soiling data
    soiling_period.end = soiling_period.end.replace(
        day=last_day,
        hour=23,
        minute=59,
        second=59,
        microsecond=999999,
    )
    # Calculation period - from start to the end of the target month
    calculate_period = period.copy()
    calculate_period.start = calculate_period.start.replace(
        day=1,
        hour=0,
        minute=0,
        second=0,
        microsecond=0,
    )

    calculate_period.end = calculate_period.end.replace(
        day=last_day,
        hour=23,
        minute=59,
        second=59,
        microsecond=999999,
    )

    # ------ 1: Get soiling data and calculate adjusted SR ------

    # Set up required features (soiling station and object) with the bazefield suffix
    soiling_features = (
        ["CellIrradiance1_5min.AVG", "CellIrradiance2_5min.AVG"]
        if "BRR" in self.object
        else ["SoilModuleIccCurrent", "SoilModuleIrradiance", "CleanModuleIrradiance"]
    )
    features = {
        self._soiling_station: [f"{feat}_b#" for feat in soiling_features],
        self.object: ["IrradiancePOAReference_5min.AVG_b#"],
    }
    # Getting soiling data for the extended period
    self._add_requirement(RequiredFeatures(features=features))
    self._get_required_data(
        period=soiling_period,
        reindex=None,
        round_timestamps={"freq": timedelta(minutes=5), "tolerance": timedelta(minutes=2)},
        cached_data=cached_data,
    )
    soiling_df = self._get_requirement_data("RequiredFeatures")

    t1 = perf_counter()

    # Adjust dataframe and get SR adjusted values
    soiling_df = soiling_df.droplevel(0, axis=1)
    soiling_df.columns = soiling_df.columns.str.removesuffix("_b#")
    self.soiling_ratio_adjusted(soiling_df)

    t2 = perf_counter()

    # ------ 2: Get target period data with adjusted ratios -------
    self._requirements = None
    target_features = {
        self._weather_station: ["SoilRateAdjusted_1d.AVG_b#"],
        self.object: [
            "IrradiancePOAReference_5min.AVG_b#",
            "ActivePowerTheoretical_10min.AVG_b#",
        ],
    }
    self._add_requirement(RequiredFeatures(features=target_features))
    self._get_required_data(
        period=calculate_period,
        reindex=None,
        round_timestamps={"freq": timedelta(minutes=5), "tolerance": timedelta(minutes=2)},
        cached_data=cached_data,
    )
    target_df = self._get_requirement_data("RequiredFeatures")

    # Process target period data
    obj_df = target_df[self.object]
    obj_df.columns = obj_df.columns.str.removesuffix("_b#")

    # ------- 3: Calculate Monthly Soiling Ratio weighted by Irradiance -------

    # Resample irradiance from 5min to 10min to match theoretical power frequency
    irradiance_10min = obj_df[["IrradiancePOAReference_5min.AVG"]].resample("10min").mean()
    power_10min = obj_df[["ActivePowerTheoretical_10min.AVG"]]
    # Combine data at same frequency (10min)
    obj_10min = irradiance_10min.join(power_10min, how="inner")

    # Aggregate daily for monthly ratio calculation only
    daily_power_df = obj_10min.resample("D").sum()
    daily_soiling_df = target_df[self._weather_station]

    # Clean column names for soiling data
    daily_soiling_df.columns = daily_soiling_df.columns.str.removesuffix("_b#")

    # Combine daily data for monthly ratio calculation
    daily_combined_df = daily_power_df.join(daily_soiling_df, how="inner")

    # Calculate monthly weighted soil ratios using daily data
    monthly_ratios = self._calculate_monthly_weighted_soiling_ratio(daily_combined_df)

    # ------- 4: Calculate Soiling Losses on 10min frequency -------
    # Apply monthly ratios to 10-minute data
    obj_10min["month"] = obj_10min.index.to_period("M")
    obj_10min["monthly_soiling_ratio"] = obj_10min["month"].map(monthly_ratios)
    obj_10min["soiling_loss_10min"] = (1 - obj_10min["monthly_soiling_ratio"]) * obj_10min["ActivePowerTheoretical_10min.AVG"]

    # Create result from 10-minute losses
    result = self._create_empty_result(period=period, freq="10min", result_type="Series")
    result = result.fillna(obj_10min["soiling_loss_10min"])

    self._result = result.copy()
    self.save(save_into=save_into, **kwargs)

    logger.debug(
        f"{self.object} - {self.feature} - {period}: "
        f"Soiling data processing {t1 - t0:.2f}s - "
        f"Ratio calculation {t2 - t1:.2f}s - "
        f"Loss calculation {perf_counter() - t2:.2f}s",
    )

    return result
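
The frequency alignment in step 3 of the code above can be checked on a toy frame. This sketch reuses the feature names from the source and only standard pandas calls; the values are made up.

```python
import pandas as pd

# 5-minute irradiance and 10-minute theoretical power over the same 20 minutes.
irradiance_5min = pd.DataFrame(
    {"IrradiancePOAReference_5min.AVG": [800.0, 820.0, 900.0, 940.0]},
    index=pd.date_range("2024-01-15 12:00", periods=4, freq="5min"),
)
power_10min = pd.DataFrame(
    {"ActivePowerTheoretical_10min.AVG": [50.0, 55.0]},
    index=pd.date_range("2024-01-15 12:00", periods=2, freq="10min"),
)

# resample("10min") labels each bin by its left edge, so the 12:00 row
# averages the 12:00 and 12:05 samples.
irradiance_10min = irradiance_5min.resample("10min").mean()

# The inner join keeps only timestamps present in both frames.
aligned = irradiance_10min.join(power_10min, how="inner")
print(aligned)
```

Whether left-edge labels match the timestamp convention of the stored 10-minute averages is worth confirming against the data source, since an offset of one bin would pair each power value with the wrong irradiance window.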

save(save_into=None, **kwargs)

Method to save the calculated feature values in performance_db.

Parameters:

  • save_into

    (Literal['all', 'performance_db'] | None, default: None ) –

    Argument that will be passed to the method "save". The options are:

      • "all": The feature will be saved in performance_db and bazefield.
      • "performance_db": The feature will be saved only in performance_db.
      • None: The feature will not be saved.

    By default None.

  • **kwargs

    (dict, default: {} ) –

    Not being used at the moment. Here only for compatibility.

Source code in echo_energycalc/feature_calc_core.py
def save(
    self,
    save_into: Literal["all", "performance_db"] | None = None,
    **kwargs,  # noqa: ARG002
) -> None:
    """
    Method to save the calculated feature values in performance_db.

    Parameters
    ----------
    save_into : Literal["all", "performance_db"] | None, optional
        Argument that will be passed to the method "save". The options are:
        - "all": The feature will be saved in performance_db and bazefield.
        - "performance_db": the feature will be saved only in performance_db.
        - None: The feature will not be saved.

        By default None.
    **kwargs : dict, optional
        Not being used at the moment. Here only for compatibility.
    """
    # checking arguments
    if not isinstance(save_into, str | type(None)):
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}")
    if isinstance(save_into, str) and save_into not in ["all", "performance_db"]:
        raise ValueError(f"save_into must be 'all', 'performance_db' or None, not {save_into}")

    # checking if calculation was done
    if self.result is None:
        raise ValueError(
            "The calculation was not done. Cannot save the feature calculation results. Please make sure to do something like 'self._result = df[self.feature].copy()' in the method 'calculate' before calling 'self.save()'.",
        )

    if save_into is None:
        return

    if isinstance(save_into, str):
        if save_into not in ["performance_db", "all"]:
            raise ValueError(f"save_into must be 'performance_db' or 'all', not {save_into}.")
        upload_to_bazefield = save_into == "all"
    elif save_into is None:
        upload_to_bazefield = False
    else:
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}.")

    # converting result series to DataFrame if needed
    if isinstance(self.result, Series):
        result_df = self.result.to_frame()
    elif isinstance(self.result, DataFrame):
        result_df = self.result.droplevel(0, axis=1)
    else:
        raise TypeError(f"result must be a pandas Series or DataFrame, not {type(self.result)}.")

    # adjusting DataFrame to be inserted in the database
    # making the columns a MultiIndex with levels object_name and feature_name
    result_df.columns = MultiIndex.from_product([[self.object], result_df.columns], names=["object_name", "feature_name"])

    self._perfdb.features.values.series.insert(
        df=result_df,
        on_conflict="update",
        bazefield_upload=upload_to_bazefield,
    )
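
The column reshaping that save performs before insertion can be reproduced in isolation. The object name below is a placeholder, and the sketch stops before the database call, which needs a live performance_db connection.

```python
import pandas as pd

object_name = "BRR-EXAMPLE"  # placeholder object name, for illustration only

# A 10-minute result series named after the calculated feature.
result = pd.Series(
    [1.5, 2.0],
    index=pd.date_range("2024-01-01", periods=2, freq="10min"),
    name="LostActivePowerSoiling",
)

# Series -> single-column DataFrame, then two-level columns (object, feature).
result_df = result.to_frame()
result_df.columns = pd.MultiIndex.from_product(
    [[object_name], result_df.columns],
    names=["object_name", "feature_name"],
)
print(result_df)
```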

soiling_ratio_adjusted(df)

Calculate and store adjusted soiling ratio in bazefield.

This method processes the soiling data, detects events, interpolates clean values, and stores the result in bazefield for later use.

Parameters:

  • df

    (DataFrame) –

    Input dataframe with soiling sensor data.

Source code in echo_energycalc/solar_energy_loss_soiling.py
def soiling_ratio_adjusted(self, df: DataFrame) -> None:
    """
    Calculate and store adjusted soiling ratio in bazefield.

    This method processes the soiling data, detects events, interpolates
    clean values, and stores the result in bazefield for later use.

    Parameters
    ----------
    df : DataFrame
        Input dataframe with soiling sensor data.
    """
    baze = Baze()

    interpolated_ratio = self._calculate_soiling_ratio_brr(df) if "BRR" in self.object else self._calculate_soiling_ratio_rbg(df)

    # Create MultiIndex columns for bazefield insertion
    columns = MultiIndex.from_tuples(
        [(self._weather_station, "SoilRateAdjusted_1d.AVG")],
        names=["object_name", "point"],
    )

    # Insert data into bazefield
    data = DataFrame(
        data=interpolated_ratio.tolist(),
        index=interpolated_ratio.index,
        columns=columns,
    )
    baze.points.values.series.insert(data=data)