Evaluating forecast accuracy

In this document

This document describes best practices for evaluating solar forecast accuracy, including guidance on selecting appropriate reference data, applying quality control, choosing suitable time periods and data resolutions, and handling forecast horizons. It is intended for analysts and engineers working with PV power output and solar radiation forecasts.

Usage in Solargis platform
This approach is used in Solargis Monitor and Solargis Forecast.

Overview

Accurate solar forecasting is fundamental to efficient energy grid management and optimal utilization of solar energy resources. Forecast accuracy is evaluated by comparing forecast values against reference values, and the quality of that comparison depends on multiple methodological choices made before any accuracy metric is calculated.

This document covers the full evaluation workflow:

Selecting the right parameter for comparison (PV power output vs. GHI/GTI).
Understanding the limitations of both measured and satellite-derived reference data.
Applying quality control to the reference dataset.
Selecting an appropriate evaluation time period and data resolution.
Handling forecast horizons correctly.
Computing forecast deviation as the basis for accuracy metrics.

Note: Careful preparation of reference data is the most critical — and most frequently overlooked — step in forecast accuracy evaluation. Errors in reference data directly compromise the validity of any accuracy metric derived from it.

Parameters for forecast accuracy evaluation

Several parameters can be used for forecast accuracy evaluation. The most common is PV power output, but Global Horizontal Irradiation (GHI) and Global Tilted Irradiation (GTI) can be used as well. Accuracy evaluation based on each of these parameters has potential issues; however, in the majority of cases, using PV power output is recommended over GHI or GTI.

Issues with PV power output

The actual measured PV power output is obtained from sensors or meters installed at the PV system site. These devices continuously monitor real-time PV power generation and provide data on the actual amount of electricity generated over the same time period as the forecast.

When PV power output is used as a reference in forecast accuracy evaluation, the following potential issues with data quality and reliability should be considered:

Inverter outages: Malfunctioning of inverters often results in sudden drops in PV power output values. Such data typically reduces the reliability of forecast accuracy evaluation results.
Curtailment: Curtailment refers to the limitation of power generation when there is excess electricity in the grid. PV power output values influenced by curtailment should be excluded from forecast accuracy evaluation, because they do not provide a reliable reference for forecast PV power output.
Logger issues: Malfunctioning of devices logging PV power output often results in illogical values, reducing the quality of the reference data. Such data points should be excluded from the forecast accuracy evaluation process.
Natural phenomena: Events such as heavy snow or fog significantly reduce the amount of solar radiation reaching the panels and, consequently, PV power output. These values are not corrupted, but consider carefully whether to include them in the forecast accuracy evaluation: forecast models struggle to predict such events, and they can significantly affect the evaluation results.

Figure 1: Curtailed PV power output (green) compared to forecast PV power output (red) on January 23, 2021.

Figure 2: Reference PV power output (green) limited by snow cover, compared to forecast PV power output (red), December 9–11, 2020.

Issues with GHI/GTI (ground measurements)

GHI is measured by horizontally mounted pyranometers, and GTI by pyranometers mounted in the plane of the PV modules (pyrheliometers measure direct normal irradiance, DNI, not GTI). These instruments are sensitive and must be maintained properly to provide accurate and reliable measurements. Without proper maintenance, they often produce reference data that compromises forecast accuracy evaluation results.

The following maintenance-related issues should be considered:

Regular calibration: Regular calibration of measuring instruments according to manufacturer guidelines is essential for high-accuracy, low-uncertainty measurement outputs. If calibration is omitted or the interval between calibrations is too long, instruments typically provide erroneous data.
Regular cleaning: When mud, dust, bird droppings, or other dirt accumulate on an instrument sensor, it blocks solar radiation and leads to lower irradiance readings. Accumulated dirt can also cause gradual sensor degradation, resulting in long-term measurement drift.
Proper alignment: When a measurement instrument is properly aligned, it receives solar radiation without obstruction from surrounding objects such as trees, buildings, or other structures. Shaded instruments produce underestimated solar radiation readings and such data should not be used during forecast accuracy evaluation.

Figure 3: Effect of GHI instrument soiling — visible as a gradual reduction in measured values relative to a clean-sensor reference.

Figure 4: Systematic shading effect seen in measured GHI and DNI — recurring drops at consistent times indicate obstruction by a nearby object.

Selecting the right parameter

When evaluating forecast accuracy, having high-quality reference data is crucial. Both measured PV power output and ground measurements (GHI/GTI) can be used for forecast accuracy evaluation. PV power output is usually preferred, because the operational performance of a PV power plant is ultimately what matters most. On the other hand, when the quality of ground measurements is higher than that of the measured PV power output, it may make sense to use GHI/GTI instead.

Types of reference data (measured vs. satellite-derived)

Two types of data can be used as a reference when evaluating forecast accuracy.

Measured data is obtained by measuring instruments located at a PV system site. Potential issues with measured reference data are described in the previous section.

Satellite-derived data refers to solar radiation estimates obtained from satellite imagery. Satellite imagery captures atmospheric and cloud conditions to estimate GHI at the Earth's surface. Estimated GHI can then be transformed into GTI and PV power output. Satellite-derived data is typically gap-free and is not affected by on-site instrument or logger problems, which is a significant advantage over measured data.

However, satellite-derived data is produced by an algorithm, and no algorithm is perfectly accurate. Satellite-derived solar radiation estimates may be affected by at least the following:

Errors due to misclassification: For example, thin clouds tend to be underestimated, and bright surfaces such as snow-covered regions, water bodies, and deserts can be overestimated.
Spatial resolution limitations: Localized variations such as cloud shadows, terrain effects, and local microclimates may not be well captured.
Difficulty in capturing local effects: For example, aerosol, dust, and pollution levels, which significantly affect solar radiation reaching the Earth's surface, may not be accurately represented.

Both measured and satellite-derived reference data have strengths and weaknesses. If high-quality measured data is available, it is recommended to use it when evaluating forecast accuracy, because it represents ground truth most directly.

Quality control

High-quality reference data is the most critical aspect of forecast accuracy evaluation. Quality control ensures that the reference data is reliable and consistent before any metrics are calculated.

For PV power output, quality control should identify:

Suspicious outliers (e.g., negative values, values exceeding the installed capacity of the PV system).
Missing values.
Sensor and inverter malfunctions.
Data logging errors.

For GHI/GTI ground measurements, quality control should identify:

Misalignment of measuring instruments.
Shading by surrounding objects.
Soiling of measuring instruments (e.g., dust, snow, water droplets, frost, bird droppings).

Identified erroneous data should be either removed from the evaluation dataset or replaced with satellite-derived counterparts. The primary goal of quality control is to prepare the reference data so that the forecast accuracy evaluation results reflect reality as closely as possible.
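
The simplest of the PV power output checks listed above (missing values, negative values, values exceeding installed capacity) can be sketched as follows. This is a minimal illustration, not the Solargis implementation: the function name `flag_pv_output`, the flag labels, and the sample readings are all hypothetical. Detecting inverter malfunctions or logger errors usually needs more than threshold checks, for example comparison against neighbouring inverters or satellite-derived values.

```python
from math import isnan


def flag_pv_output(values, installed_capacity_kw):
    """Return one QC flag per reading: 'ok', 'missing', 'negative', or 'above_capacity'.

    Hypothetical helper: the flag names and thresholds are illustrative assumptions.
    """
    flags = []
    for v in values:
        if v is None or (isinstance(v, float) and isnan(v)):
            flags.append("missing")
        elif v < 0:
            flags.append("negative")
        elif v > installed_capacity_kw:
            flags.append("above_capacity")
        else:
            flags.append("ok")
    return flags


# Hypothetical hourly PV power readings [kW] for a 1000 kW system
readings = [0.0, 250.0, None, -5.0, 1200.0, 640.0]
flags = flag_pv_output(readings, installed_capacity_kw=1000.0)

# Keep only the readings that passed all checks, together with their position
clean = [(i, v) for i, (v, f) in enumerate(zip(readings, flags)) if f == "ok"]
```

In this sketch the flagged readings are simply dropped; replacing them with satellite-derived counterparts would be the alternative described above.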

Figure 5: Example of a partial drop in PV power output caused by an inverter issue — the affected period is highlighted.

Figure 6: Example of a data logger issue visible in PVOUT values — illogical spikes and gaps indicate logging errors.

Recommended time period

Forecast accuracy can be evaluated for time periods of different lengths — from a single hour to a full year or longer. Ultimately, forecast accuracy evaluation should provide results that cover diverse weather conditions throughout the year.

It is recommended to use the most recent 12 months of data to obtain comprehensive forecast accuracy evaluation statistics. The following considerations support this recommendation:

Analysis of seasonal trends: Evaluating forecast accuracy across different seasons (e.g., spring, summer, autumn, winter, or rainy vs. dry season) captures seasonal variations in PV power output and solar radiation prediction. Seasonal variations typically impact forecast accuracy, and evaluating a full year assesses how well forecasts align with actual reference data across all seasons.
Variability assessment: PV power output and solar radiation can vary significantly based on weather patterns, seasons, and time of day. A longer evaluation period allows an assessment of how well forecast models capture this variability across diverse conditions.
Longer-term performance: Short-term variations or anomalies in weather patterns may occur and can skew forecast accuracy statistics over a period of days or weeks. Evaluating forecast accuracy over a full year provides a more comprehensive view of forecast model capabilities.

Forecast accuracy evaluation for a shorter time period may not provide a complete understanding of forecast performance. Evaluating the most recent 12 months of data provides better insight into forecast accuracy across various weather conditions.

Recommended data resolution

When evaluating forecast accuracy, the reference data must be resampled (aggregated) to a suitable temporal resolution. While 1-minute or 5-minute measurements of PV power output and GHI/GTI provide high-resolution data, Numerical Weather Prediction (NWP) models may not be able to capture such short-term variability accurately.

The main reasons for this limitation are:

Temporal resolution: NWP models typically operate at coarser temporal resolutions, often providing forecasts at hourly or less frequent intervals. These models are designed to capture larger-scale weather patterns and trends rather than short-term fluctuations.
Spatial resolution: NWP models operate at certain spatial resolutions and may not capture localized variations in solar radiation that occur over short time intervals.
Model limitations: The physics-based equations used in NWP models have inherent limitations in representing microscale atmospheric processes that influence short-term solar radiation variability, such as cloud formation and dissipation, turbulence, and local terrain effects.

Figure 7: Difference in variability between forecast data (red) and 5-minute reference data (green) on December 22, 2024. High-frequency variability in the reference is not captured by the forecast.

Using 1-minute or 5-minute measurements as a reference for evaluating forecast accuracy may introduce discrepancies due to the inherent differences in temporal resolution between the reference and the forecast.

Forecast accuracy evaluation is about understanding the limitations and characteristics of both data sources. Therefore, it is recommended to aggregate 1-minute or 5-minute PV power output or GHI/GTI measurements to match the temporal resolution of the forecast data (e.g., 15-minute, hourly) before comparing them. This ensures a more meaningful and fair comparison.
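
As an illustration of this aggregation step, the sketch below averages 5-minute power samples into hourly means. It rests on several assumptions that are not stated in this document: samples are labelled by the start of their interval, hourly values are labelled by the start of the hour, and an hour is used only when all 12 samples are valid. The function name `aggregate_to_hourly` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def aggregate_to_hourly(samples, expected_per_hour=12):
    """Average 5-minute power samples into hourly means.

    samples: iterable of (datetime, value) pairs; value may be None.
    Hours with fewer than `expected_per_hour` valid samples are skipped,
    so that incomplete hours do not bias the comparison (an assumption).
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        if value is not None:
            hour_start = ts.replace(minute=0, second=0, microsecond=0)
            buckets[hour_start].append(value)
    return {
        hour: sum(vals) / len(vals)
        for hour, vals in sorted(buckets.items())
        if len(vals) == expected_per_hour
    }


# Two hours of hypothetical 5-minute power samples [kW]
start = datetime(2024, 12, 22, 10, 0)
samples = [(start + timedelta(minutes=5 * i), 100.0 + i) for i in range(24)]
samples[-1] = (samples[-1][0], None)  # one missing value in the second hour

hourly = aggregate_to_hourly(samples)
```

Averaging suits power values (kW) and irradiance (W/m2); for energy values (kWh) the samples within each hour would be summed instead.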

Forecast horizon

The forecast horizon refers to the time period into the future for which predictions are made. Handling the forecast horizon correctly is essential for reliable accuracy evaluation.

The difference between operational and historical forecasts is explained in the Operation and historical forecasts document.

Historical forecasts

In the case of historical forecasts, evaluating accuracy is straightforward. Each file with historical forecasts always contains predictions for a single, consistent forecast horizon.

Available historical forecast horizons

H0 ("intra-hour")
H1 ("hour-ahead")
H2 ("two hours-ahead")
D0 ("intra-day")
D1 ("day-ahead")
D2 ("two days-ahead")
D3 ("three days-ahead")

Operational forecasts

In the case of operational forecasts, the situation is different. A single file with operational forecasts may contain predictions ranging from H0 up to D14, meaning that a single file contains multiple forecast horizons mixed together.

It is not recommended to evaluate different forecast horizons together. The longer the forecast horizon, the higher the uncertainty of the predictions, so mixing horizons in a single evaluation produces misleading results. Instead, the forecast horizons of interest should be filtered from the operational forecast files and aggregated into a single time series before the forecast accuracy evaluation is performed.

Assessing a consistent forecast horizon ensures reliable and comparable accuracy evaluation results.
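
The filtering step can be sketched as below for a day-ahead series. The record layout (issue time, valid time, value) and the definition of "day-ahead" used here (the valid time falls on the calendar day after the issue time) are assumptions made for illustration; the actual structure of operational forecast files and the exact horizon definitions may differ. The function name `select_day_ahead` is hypothetical.

```python
from datetime import datetime


def select_day_ahead(records):
    """Build one day-ahead time series from mixed-horizon forecast records.

    records: iterable of (issued_at, valid_at, value) tuples.
    Assumption: "day-ahead" means valid_at is on the calendar day after issued_at.
    When several issues cover the same valid time, the latest issue is kept.
    """
    best = {}
    for issued_at, valid_at, value in records:
        if (valid_at.date() - issued_at.date()).days != 1:
            continue  # not a day-ahead prediction under the assumed definition
        if valid_at not in best or issued_at > best[valid_at][0]:
            best[valid_at] = (issued_at, value)
    return [(valid_at, best[valid_at][1]) for valid_at in sorted(best)]


# Hypothetical records mixing several horizons
records = [
    (datetime(2024, 2, 21, 6), datetime(2024, 2, 22, 8), 390.0),   # day-ahead
    (datetime(2024, 2, 21, 12), datetime(2024, 2, 22, 8), 400.0),  # day-ahead, newer issue
    (datetime(2024, 2, 22, 6), datetime(2024, 2, 22, 8), 385.0),   # same day, excluded
    (datetime(2024, 2, 20, 12), datetime(2024, 2, 22, 9), 470.0),  # two days ahead, excluded
    (datetime(2024, 2, 21, 12), datetime(2024, 2, 22, 9), 456.0),  # day-ahead
]

day_ahead = select_day_ahead(records)
```

Keeping the latest issue for each valid time is one possible choice; keeping a fixed issue time (for example, the forecast issued at a given hour each day) may better match how the forecast is used operationally.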

Forecast accuracy evaluation metrics

Forecast deviation, also referred to as forecast error, is the numerical difference between the forecasted value (PV power output or GHI/GTI) and the reference value. It indicates how much the forecast deviates from the reference and is typically expressed either in the units of the evaluated parameter (e.g., kWh) or as a percentage.

The forecast deviation for a single time step is calculated as:

$$\text{Forecast deviation} = \text{Forecast} - \text{Reference}$$

A positive deviation indicates that the forecast overestimates the reference; a negative deviation indicates underestimation.

Table 1: Base inputs for forecast accuracy evaluation — example of forecast deviation calculation

Date	Time UTC+0	Forecast [kWh]	Reference [kWh]	Forecast deviation [kWh]
22.02.2024	8:00	400	383	17
22.02.2024	9:00	456	471	-15
22.02.2024	10:00	564	610	-46
22.02.2024	11:00	753	636	117
22.02.2024	12:00	673	526	147
22.02.2024	13:00	593	663	-70
22.02.2024	14:00	498	623	-125
22.02.2024	15:00	489	467	22

Calculating forecast deviation is the first step in the forecast accuracy evaluation process.
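
The deviation column of Table 1 can be reproduced with a few lines of code. The sketch below uses the forecast and reference values from the table; the function name `forecast_deviation` is hypothetical.

```python
# (time UTC+0, forecast [kWh], reference [kWh]) taken from Table 1
rows = [
    ("8:00", 400, 383),
    ("9:00", 456, 471),
    ("10:00", 564, 610),
    ("11:00", 753, 636),
    ("12:00", 673, 526),
    ("13:00", 593, 663),
    ("14:00", 498, 623),
    ("15:00", 489, 467),
]


def forecast_deviation(forecast, reference):
    """Signed deviation: positive means the forecast overestimates the reference."""
    return forecast - reference


deviations = [forecast_deviation(f, r) for _, f, r in rows]

for (time, forecast, reference), dev in zip(rows, deviations):
    print(f"{time:>5}  forecast={forecast}  reference={reference}  deviation={dev:+d}")
```

Keeping the sign at this stage preserves the direction of the error; it is only discarded later if a metric based on absolute or squared deviations is computed.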
