Solar Resource Loss¶

Overview¶

SolarEnergyLossResource calculates the daily energy loss (or gain) attributable to irradiance deviation from target (P50 or other Pxx). The output is a daily average power value (average kW over the day), positive meaning lower-than-target irradiance and negative meaning above-target irradiance.

A pre-trained linear polynomial regression model (trained on PVSyst simulations) converts irradiance to expected energy at the Connection Point level. Using the same model for both measured and target irradiance ensures that any bias in the model cancels out in the difference.

The training script is at manual_routines\solar_resource_loss on the Performance Server.

Calculation Logic¶

1. Irradiance Acquisition¶

Fetches IrradiancePOACommOk_5min.AVG from all simple weather stations listed in reference_weather_stations["simple_ws"] (can be a list). Timestamps are rounded to 5-minute boundaries (±2 min tolerance).

2. Averaging Across Weather Stations¶

If multiple simple weather stations are configured, the irradiance values are averaged across all stations for each timestamp.

3. Complete Day Filtering¶

The data is resampled to hourly frequency. Only days with exactly 24 hourly observations are kept — incomplete days (due to data gaps) are discarded and logged as warnings.

4. Night Value Adjustment¶

Uses pvlib with the object's latitude and longitude to identify night periods (sun below horizon). Null irradiance values during nighttime are replaced with 0.0.

5. Daily Aggregation¶

Hourly irradiance values are summed to a daily total (equivalent to Wh/m² × h factor).

6. Measured Energy Prediction¶

The regression model is applied to the daily irradiance totals:

Text Only

measured_energy_kWh = model.predict(daily_irradiance_sum)
measured_kW avg = measured_energy_kWh / 24

7. Target Energy Retrieval¶

For each day in the period:

Queries the active KPI target Pxx (e.g., P50) and evaluation period from performance_db.
Retrieves the corresponding target daily irradiance from the resource assessments table (resourceassessments.pxx.get).

Applies the same regression model to the target irradiance to get target energy:

Text Only

target_irradiance_daily = target_point_value × 24
target_kW avg = model.predict(target_irradiance_daily) / 24

The period can span multiple target configurations (e.g., different Pxx or evaluation period in January vs. December), and each segment is processed independently.

8. Loss Calculation¶

Text Only

resource_loss_kW avg = target_kW avg - measured_kW avg

Positive values = measured irradiance was below target (energy loss). Negative values = measured irradiance was above target (energy gain).

Database Requirements¶

Feature Attribute¶

Attribute	Value
`server_calc_type`	`solar_energy_loss_resource`
`feature_options_json`	JSON object — see below

`feature_options_json` Schema¶

Key	Type	Required	Description
`calc_model_type`	string	Yes	Exact model type (e.g., `"solar_resource_fit"`).
`model_name`	string	Yes	Substring of the model name in performance_db (e.g., `"solar_resource_regression"`).
`bazefield_features`	boolean	Yes	If `true`, fetches features from Bazefield.

Example:

JSON

{
    "calc_model_type": "solar_resource_fit",
    "model_name": "solar_resource_regression",
    "bazefield_features": true
}

Object Attributes¶

Attribute	Required	Description
`reference_weather_stations`	Yes	Dict with `"simple_ws"` key (string or list of strings) naming weather station(s) for irradiance.
`latitude`	Yes	Geographic latitude (decimal degrees). Used for night masking via `pvlib`.
`longitude`	Yes	Geographic longitude (decimal degrees). Used for night masking via `pvlib`.

Calculation Model¶

Requirement	Description
Model type	Must match `calc_model_type` exactly
Model name	Must contain `model_name` as a substring
Input	Daily irradiance sum (W/m² × hours equivalent)
Output	Daily expected energy at Connection Point (kWh/day)

Features (simple weather station)¶

Feature	Description
`IrradiancePOACommOk_5min.AVG`	Plane-of-array irradiance (W/m²). Fetched with `_b#` if `bazefield_features = true`.

Performance DB Tables¶

Table / View	Used for
`kpis.energy.targets`	Retrieve active Pxx target and evaluation period for the period
`resourceassessments.pxx`	Retrieve daily target irradiance for the active Pxx

Class Definition¶

`SolarEnergyLossResource(object_name, feature)` ¶

Base class for solar energy loss/gain from Irradiance.

For this class to work, the feature must have the attribute feature_options_json with the following keys:

'calc_model_type': type of the model that will be used to calculate the feature. It must match the type of the model in performance_db.
'model_name': name of the model that will be used to calculate the feature.
'bazefield_features': bool indicating if the required features needs to be acquired from bazefield.

Parameters:

object_name ¶
(str) –

Name of the object for which the feature is calculated. It must exist in performance_db.
feature ¶
(str) –

Feature of the object that is calculated. It must exist in performance_db.

Source code in echo_energycalc/solar_energy_loss_resource.py

Python

def __init__(self, object_name: str, feature: str) -> None:
    """
    Class used to calculate features that depend on a PredictiveModel.

    For this class to work, the feature must have the attribute `feature_options_json` with the following keys:

    - 'calc_model_type': type of the model that will be used to calculate the feature. It must match the type of the model in performance_db.
    - 'model_name': name of the model that will be used to calculate the feature.
    - 'bazefield_features': bool indicating if the required features needs to be acquired from bazefield.

    Parameters
    ----------
    object_name : str
        Name of the object for which the feature is calculated. It must exist in performance_db.
    feature : str
        Feature of the object that is calculated. It must exist in performance_db.
    """
    # initialize parent class
    super().__init__(object_name, feature)

    # load feature options, model requirements, and deserialize joblib model
    self._setup_model_from_feature_options()

    # defining required features
    simple_ws = self._requirement_data("RequiredObjectAttributes")[self.object]["reference_weather_stations"]["simple_ws"]
    features = {ws: ["IrradiancePOACommOk_5min.AVG"] for ws in simple_ws}

    # Adding suffix _b# to features if bazefield_features is True
    if self._feature_attributes["feature_options_json"].get("bazefield_features", False):
        features = {obj: [f"{feat}_b#" for feat in feats] for obj, feats in features.items()}
    self._add_requirement(RequiredFeatures(features=features))

`feature` `property` ¶

Feature that is calculated. This will be defined in the constructor and cannot be changed.

Returns:

str –

Name of the feature that is calculated.

`name` `property` ¶

Name of the feature calculator. Is defined in child classes of FeatureCalculator.

This must be equal to the "server_calc_type" attribute of the feature in performance_db.

Returns:

str –

Name of the feature calculator.

`object` `property` ¶

Object for which the feature is calculated. This will be defined in the constructor and cannot be changed.

Returns:

str –

Object name for which the feature is calculated.

`requirements` `property` ¶

List of requirements of the feature calculator. Is defined in child classes of FeatureCalculator.

Returns:

dict[str, list[CalculationRequirement]] –

Dict of requirements.

The keys are the names of the classes of the requirements and the values are lists of requirements of that class.

For example: {"RequiredFeatures": [RequiredFeatures(...), RequiredFeatures(...)], "RequiredObjects": [RequiredObjects(...)]}

`result` `property` ¶

Result of the calculation. This is None until the method "calculate" is called.

Returns:

DataFrame | None –

Polars DataFrame with a "timestamp" column and one or more feature value columns. None until calculate is called.

`calculate(period, save_into=None, cached_data=None, **kwargs)` ¶

Method that will calculate the feature.

This code will do the following: 1. Get irradiance data from the weather stations associated with the object. 2. Average the irradiance data from all weather stations. 3. Resample the data to daily frequency, keeping only complete days (24 hours of data). 4. Predict the energy production using the model. 5. Get the target energy production from performance_db (P50) as used in current budget. 6. Calculate the energy loss as the difference between the target energy production and the predicted energy production.

Parameters:

period ¶
(DateTimeRange) –

Period for which the feature will be calculated.
save_into ¶
(Literal['all', 'performance_db'] | None, default: None ) –

Argument that will be passed to the method "save". The options are: - "all": The feature will be saved in performance_db and bazefield. - "performance_db": the feature will be saved only in performance_db. - None: The feature will not be saved.

By default None.
cached_data ¶
(DataFrame | None, default: None ) –

DataFrame with features already queried/calculated. This is useful to avoid needing to query all the data again from performance_db, making chained calculations a lot more efficient. By default None
**kwargs ¶
(dict, default: {} ) –

Additional arguments that will be passed to the "save" method.

Returns:

DataFrame –

Polars DataFrame with the calculated feature.

Source code in echo_energycalc/solar_energy_loss_resource.py

Python

def calculate(
    self,
    period: DateTimeRange,
    save_into: Literal["all", "performance_db"] | None = None,
    cached_data: pl.DataFrame | None = None,
    **kwargs,
) -> pl.DataFrame:
    """
    Method that will calculate the feature.

    This code will do the following:
    1. Get irradiance data from the weather stations associated with the object.
    2. Average the irradiance data from all weather stations.
    3. Resample the data to daily frequency, keeping only complete days (24 hours of data).
    4. Predict the energy production using the model.
    5. Get the target energy production from performance_db (P50) as used in current budget.
    6. Calculate the energy loss as the difference between the target energy production and the predicted energy production.

    Parameters
    ----------
    period : DateTimeRange
        Period for which the feature will be calculated.
    save_into : Literal["all", "performance_db"] | None, optional
        Argument that will be passed to the method "save". The options are:
        - "all": The feature will be saved in performance_db and bazefield.
        - "performance_db": the feature will be saved only in performance_db.
        - None: The feature will not be saved.

        By default None.
    cached_data : pl.DataFrame | None, optional
        DataFrame with features already queried/calculated. This is useful to avoid needing to query all the data again from performance_db, making chained calculations a lot more efficient.
        By default None
    **kwargs : dict, optional
        Additional arguments that will be passed to the "save" method.

    Returns
    -------
    pl.DataFrame
        Polars DataFrame with the calculated feature.
    """
    t0 = perf_counter()
    adjusted_period = period.copy()

    # Getting feature values
    self._fetch_requirements(
        period=adjusted_period,
        reindex=None,
        round_timestamps={"freq": timedelta(minutes=5), "tolerance": timedelta(minutes=2)},
        cached_data=cached_data,
    )

    t1 = perf_counter()

    # Average irradiance across weather stations → polars DataFrame with ["timestamp", "IrradiancePOACommOk_5min.AVG"]
    features_pl = self._requirement_data("RequiredFeatures")
    avg_pl = (
        self._average_weather_station_features(features_pl, "IrradiancePOACommOk_5min.AVG")
        .rename({"IrradiancePOACommOk_5min.AVG": "GlobInc"})
        .sort("timestamp")
    )

    # Resample to hourly by taking the mean within each hour
    hourly_df = avg_pl.group_by_dynamic("timestamp", every="1h").agg(pl.col("GlobInc").mean())

    # Identify complete days (those with exactly 24 hourly rows)
    daily_counts = hourly_df.group_by_dynamic("timestamp", every="1d").agg(pl.len().alias("n_hours"))
    complete_day_starts = daily_counts.filter(pl.col("n_hours") == 24)["timestamp"]

    hourly_complete = (
        hourly_df.with_columns(pl.col("timestamp").dt.truncate("1d").alias("_day"))
        .filter(pl.col("_day").is_in(complete_day_starts.implode()))
        .drop("_day")
    )

    # Log discarded days
    all_day_starts = avg_pl.with_columns(pl.col("timestamp").dt.truncate("1d").alias("_day"))["_day"].unique()
    discarded = set(all_day_starts.to_list()) - set(complete_day_starts.to_list())
    if discarded:
        logger.warning(
            f"{self.object} - {self.feature} - {period}: Discarded days due to less than 24 hours of data: "
            f"{', '.join(str(d.date()) for d in sorted(discarded))}",
        )

    # Zero NaN irradiance at night
    obj_attrs = self._requirement_data("RequiredObjectAttributes")[self.object]
    is_night = self._get_night_mask(hourly_complete["timestamp"], obj_attrs["latitude"], obj_attrs["longitude"])
    hourly_complete = hourly_complete.with_columns(
        pl.when(is_night & pl.col("GlobInc").is_null()).then(pl.lit(0.0)).otherwise(pl.col("GlobInc")).alias("GlobInc"),
    )

    # Sum to daily irradiance (Wh/m²/day equivalent)
    daily_df = hourly_complete.group_by_dynamic("timestamp", every="1d").agg(pl.col("GlobInc").sum())

    t2 = perf_counter()

    if daily_df.is_empty():
        result_pl = self._create_empty_result(period=adjusted_period, freq="1d", result_type="Series")
        self._result = result_pl
        self.save(save_into=save_into, **kwargs)
        return result_pl

    # Model prediction of measured daily energy (kWh/day), then convert to kWmed (average kW over the day)
    predicted_kwh = self._model.predict(daily_df.select("GlobInc").to_numpy())
    daily_df = daily_df.with_columns(
        (pl.Series("_pred", predicted_kwh) / 24.0).alias("predicted_kWmed"),
    )

    # KPI: get which target Pxx is used for this period
    target_pxx_pl = self._perfdb.kpis.energy.targets.get(
        period=adjusted_period,
        time_res="daily",
        object_or_group_names=[self.object],
        measurement_points=["Connection Point"],
        values_only=True,
        output_type="pl.DataFrame",
    )

    # Process each unique combination of target_pxx and target_evaluation_period
    # This allows handling periods that span multiple target configurations (e.g., Dec-2025 to Jan-2026)
    target_energy_parts: list[pl.DataFrame] = []
    groups = target_pxx_pl.select(["target_pxx", "target_evaluation_period"]).unique()

    for row in groups.iter_rows(named=True):
        target_pxx_value = row["target_pxx"]
        target_eval_period = row["target_evaluation_period"]

        group_dates = target_pxx_pl.filter(
            (pl.col("target_pxx") == target_pxx_value) & (pl.col("target_evaluation_period") == target_eval_period),
        )["date"]

        # Query target irradiance for this specific pxx and evaluation period
        target_irr_pl = self._perfdb.resourceassessments.pxx.get(
            period=adjusted_period,
            time_res="daily",
            pxx=[target_pxx_value],
            evaluation_periods=[target_eval_period],
            group_names=[self.object],
            resource_types=["solar_irradiance_poa"],
            output_type="pl.DataFrame",
        )

        # Filter to dates belonging to this target_pxx group
        target_irr_pl = target_irr_pl.filter(pl.col("date").is_in(group_dates.cast(pl.Datetime("ms"))))
        if target_irr_pl.is_empty():
            continue

        # Daily total irradiance (multiply point value by 24 to get daily sum)
        target_irr_arr = target_irr_pl["value"].to_numpy() * 24
        # Predict target energy using the same model, then convert to kWmed
        target_energy_arr = self._model.predict(target_irr_arr.reshape(-1, 1)) / 24

        target_energy_parts.append(
            pl.DataFrame(
                {
                    "timestamp": target_irr_pl["date"].cast(pl.Datetime("ms")),
                    "target_kWmed": pl.Series(target_energy_arr.ravel()),
                }
            ),
        )

    if not target_energy_parts:
        result_pl = self._create_empty_result(period=adjusted_period, freq="1d", result_type="Series")
        self._result = result_pl
        self.save(save_into=save_into, **kwargs)
        return result_pl

    # Combine all target energy segments and compute loss = target - measured
    target_energy_pl = pl.concat(target_energy_parts).sort("timestamp")

    result_pl = (
        target_energy_pl.join(daily_df.select(["timestamp", "predicted_kWmed"]), on="timestamp", how="left")
        .with_columns((pl.col("target_kWmed") - pl.col("predicted_kWmed")).alias(self.feature))
        .filter((pl.col("timestamp") >= adjusted_period.start) & (pl.col("timestamp") < adjusted_period.end))
        .select(["timestamp", self.feature])
    )

    t3 = perf_counter()

    self._result = result_pl
    self.save(save_into=save_into, **kwargs)

    logger.debug(
        f"{self.object} - {self.feature} - {period}: Requirements during calc {t1 - t0:.2f}s - Data adjustments {t2 - t1:.2f}s - Model prediction {t3 - t2:.2f}s - Saving data {perf_counter() - t3:.2f}s",
    )

    return result_pl

`save(save_into=None, **kwargs)` ¶

Method to save the calculated feature values in performance_db.

Parameters:

save_into ¶
(Literal['all', 'performance_db'] | None, default: None ) –

Argument that will be passed to the method "save". The options are: - "all": The feature will be saved in performance_db and bazefield. - "performance_db": the feature will be saved only in performance_db. - None: The feature will not be saved.

By default None.
**kwargs ¶
(dict, default: {} ) –

Not being used at the moment. Here only for compatibility.

Source code in echo_energycalc/feature_calc_core.py

Python

def save(
    self,
    save_into: Literal["all", "performance_db"] | None = None,
    **kwargs,  # noqa: ARG002
) -> None:
    """
    Method to save the calculated feature values in performance_db.

    Parameters
    ----------
    save_into : Literal["all", "performance_db"] | None, optional
        Argument that will be passed to the method "save". The options are:
        - "all": The feature will be saved in performance_db and bazefield.
        - "performance_db": the feature will be saved only in performance_db.
        - None: The feature will not be saved.

        By default None.
    **kwargs : dict, optional
        Not being used at the moment. Here only for compatibility.
    """
    # checking arguments
    if not isinstance(save_into, str | type(None)):
        raise TypeError(f"save_into must be a string or None, not {type(save_into)}")
    if isinstance(save_into, str) and save_into not in ["all", "performance_db"]:
        raise ValueError(f"save_into must be 'all', 'performance_db' or None, not {save_into}")

    # checking if calculation was done
    if self.result is None:
        raise ValueError(
            "The calculation was not done. Please call 'calculate' before calling 'save'.",
        )

    if save_into is None:
        return

    upload_to_bazefield = save_into == "all"

    if not isinstance(self.result, pl.DataFrame):
        raise TypeError(f"result must be a polars DataFrame, not {type(self.result)}.")
    if "timestamp" not in self.result.columns:
        raise ValueError("result DataFrame must contain a 'timestamp' column.")

    # rename feature columns to "object@feature" format expected by perfdb polars insert
    feat_cols = [c for c in self.result.columns if c != "timestamp"]
    result_pl = self.result.rename({col: f"{self.object}@{col}" for col in feat_cols})

    self._perfdb.features.values.series.insert(
        df=result_pl,
        on_conflict="update",
        bazefield_upload=upload_to_bazefield,
    )

Solar Resource Loss¶

Overview¶

Calculation Logic¶

1. Irradiance Acquisition¶

2. Averaging Across Weather Stations¶

3. Complete Day Filtering¶

4. Night Value Adjustment¶

5. Daily Aggregation¶

6. Measured Energy Prediction¶

7. Target Energy Retrieval¶

8. Loss Calculation¶

Database Requirements¶

Feature Attribute¶

`feature_options_json` Schema¶

Object Attributes¶

Calculation Model¶

Features (simple weather station)¶

Performance DB Tables¶

Class Definition¶

`SolarEnergyLossResource(object_name, feature)` ¶

`object_name` ¶

`feature` ¶

`feature` `property` ¶

`name` `property` ¶

`object` `property` ¶

`requirements` `property` ¶

`result` `property` ¶

`calculate(period, save_into=None, cached_data=None, **kwargs)` ¶

`period` ¶

`save_into` ¶

`cached_data` ¶

`kwargs`** ¶

`save(save_into=None, **kwargs)` ¶

`save_into` ¶

`kwargs`** ¶

Solar Resource Loss¶

Overview¶

Calculation Logic¶

1. Irradiance Acquisition¶

2. Averaging Across Weather Stations¶

3. Complete Day Filtering¶

4. Night Value Adjustment¶

5. Daily Aggregation¶

6. Measured Energy Prediction¶

7. Target Energy Retrieval¶

8. Loss Calculation¶

Database Requirements¶

Feature Attribute¶

feature_options_json Schema¶

Object Attributes¶

Calculation Model¶

Features (simple weather station)¶

Performance DB Tables¶

Class Definition¶

SolarEnergyLossResource(object_name, feature) ¶

object_name ¶

feature ¶

feature property ¶

name property ¶

object property ¶

requirements property ¶

result property ¶

calculate(period, save_into=None, cached_data=None, **kwargs) ¶

period ¶

save_into ¶

cached_data ¶

**kwargs ¶

save(save_into=None, **kwargs) ¶

save_into ¶

**kwargs ¶

`feature_options_json` Schema¶

`SolarEnergyLossResource(object_name, feature)` ¶

`object_name` ¶

`feature` ¶

`feature` `property` ¶

`name` `property` ¶

`object` `property` ¶

`requirements` `property` ¶

`result` `property` ¶

`calculate(period, save_into=None, cached_data=None, **kwargs)` ¶

`period` ¶

`save_into` ¶

`cached_data` ¶

`kwargs`** ¶

`save(save_into=None, **kwargs)` ¶

`save_into` ¶

`kwargs`** ¶