Validation of the soiling loss model

In this document

This document validates the Solargis soiling loss model. It covers the methodology, input data, evaluation framework, statistical results, and key findings, based on comparisons with measured soiling data from 51 sites worldwide.

Overview

Soiling, the accumulation of airborne particles on photovoltaic (PV) panels, is a complex phenomenon with significant spatial and temporal variability. It is one of the major sources of uncertainty in PV performance modeling, which makes accurate soiling modeling essential for PV system planning and operation.

This document presents a detailed evaluation of the Solargis soiling loss model, assessing its performance and limitations through objective validation.

  • Geographical scope: Global

  • Time resolution: Daily values

  • Data parameters: Soiling loss

  • Calculated indicators: Bias, MAD, RMSD

  • Reference period: Usually about 1 year; dates vary by station

  • Reference source: Several publicly available datasets

Figure 1:  Map of sites used for validation of the soiling model.

Solargis soiling loss model overview

The Solargis soiling model employs a physics-based approach tailored for PV modules. It estimates the mass of airborne particles deposited on panels, empirically linking this to the soiling ratio (SR) and thus to soiling loss. The model operates daily and accounts for both natural (rain) and manual cleaning events.
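
The daily logic described above can be illustrated with a toy simulation. This is only a sketch and not the Solargis implementation: the function name, the deposition values, the linear mass-to-ratio coefficient `loss_per_g_m2`, the rain threshold, and the cleaning efficiency are all hypothetical choices made for illustration. It uses the common convention that soiling loss equals 1 minus the soiling ratio.

```python
def simulate_soiling_ratio(deposition, rain, manual_cleaning,
                           loss_per_g_m2=0.05, rain_threshold_mm=1.0,
                           rain_efficiency=0.8):
    """Toy daily soiling model (all parameters are illustrative assumptions).

    deposition      -- daily deposited particle mass [g/m2] (hypothetical input)
    rain            -- daily rainfall [mm]
    manual_cleaning -- booleans, True on days with a manual cleaning
    Returns a list of daily soiling ratios (1.0 = clean module).
    """
    mass = 0.0
    ratios = []
    for dep, mm, cleaned in zip(deposition, rain, manual_cleaning):
        mass += dep  # particles accumulate over the full day
        if cleaned:
            mass = 0.0  # assumption: manual cleaning fully resets the module
        elif mm >= rain_threshold_mm:
            mass *= 1.0 - rain_efficiency  # rain removes a fraction of the mass
        # Assumed linear empirical link between deposited mass and soiling ratio.
        ratios.append(max(0.0, 1.0 - loss_per_g_m2 * mass))
    return ratios


# Made-up inputs: rain on day 3, manual cleaning on day 5.
deposition = [0.1, 0.1, 0.1, 0.1, 0.1]
rain = [0.0, 0.0, 5.0, 0.0, 0.0]
manual = [False, False, False, False, True]

sr = simulate_soiling_ratio(deposition, rain, manual)
soiling_loss_pct = [100.0 * (1.0 - r) for r in sr]
```

In this example the soiling ratio falls on the two dry days, partly recovers after the rain event, and returns to 1.0 on the day of the manual cleaning.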

Reference data and preprocessing

Ground-based observations: These are crucial for validating the model, as they provide direct measurements of soiling effects on PV performance. However, such data are limited due to the complexity and cost of soiling monitoring.

Challenges in reference data:

  • Different measuring methods: Various instruments and methodologies are used across stations, which can lead to inconsistencies in the measured soiling ratio even under similar conditions.

  • Varying time resolutions: Some sites record data at different intervals, complicating uniform quality control and comparison.

  • Missing site information: Incomplete metadata (e.g., module tilt, mounting type, cleaning schedules) can affect the interpretation of soiling data.

  • Limited historical data: While a year of data is usually enough to capture seasonal trends, longer-term patterns may remain undetected.

  • Limited geographic coverage: Despite the broad sample, some climates and configurations are underrepresented.

Data quality control: The reference data underwent expert review and preprocessing, including:

  • Flagging suspicious or erroneous entries.

  • Aggregating data to daily values for consistency.

  • Detecting and marking manual cleaning events.

  • Correcting measurement offsets to ensure comparability (see Figure 2 for an example of offset correction and cleaning event detection).

Figure 2:  Example of offset correction. Detected manual cleaning events are marked.
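
The preprocessing steps listed above could be sketched roughly as follows. The source does not describe the actual algorithms, so everything here is an assumption: the function names, the jump threshold `min_jump`, the dry-day limit `dry_mm`, and in particular the interpretation of offset correction as shifting the series so that freshly cleaned modules read a soiling ratio of 1.0.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def aggregate_daily(records):
    """Average (timestamp, soiling_ratio) records into one value per day."""
    by_day = defaultdict(list)
    for timestamp, ratio in records:
        by_day[timestamp.date()].append(ratio)
    return {day: mean(values) for day, values in sorted(by_day.items())}


def detect_manual_cleaning(daily_ratio, daily_rain, min_jump=0.02, dry_mm=0.1):
    """Flag days where the ratio jumps up although no rain was recorded."""
    days = sorted(daily_ratio)
    events = []
    for prev, day in zip(days, days[1:]):
        jump = daily_ratio[day] - daily_ratio[prev]
        if jump >= min_jump and daily_rain.get(day, 0.0) < dry_mm:
            events.append(day)
    return events


def correct_offset(daily_ratio, cleaning_days):
    """Shift the series so the mean ratio on cleaning days equals 1.0."""
    if not cleaning_days:
        return dict(daily_ratio)
    offset = 1.0 - mean(daily_ratio[d] for d in cleaning_days)
    return {day: ratio + offset for day, ratio in daily_ratio.items()}


# Made-up sub-daily measurements; the ratio recovers on 3 January without rain.
records = [
    (datetime(2023, 1, 1, 10), 0.97), (datetime(2023, 1, 1, 14), 0.97),
    (datetime(2023, 1, 2, 10), 0.96), (datetime(2023, 1, 2, 14), 0.96),
    (datetime(2023, 1, 3, 12), 0.99),
]
daily = aggregate_daily(records)
events = detect_manual_cleaning(daily, daily_rain={})
corrected = correct_offset(daily, events)
```

Here 3 January is flagged as a manual cleaning event, and the whole series is shifted by the difference between 1.0 and the measured ratio on that day. A real workflow would also need the expert review step mentioned above, which a fixed threshold cannot replace.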

Input data

  • Global data sources: The model relies on global datasets, enabling application anywhere in the world. However, these sources have lower spatial resolution than the highly localized nature of soiling, introducing some uncertainty.

  • Particulate matter concentrations: PM2.5 and PM10 data are taken from MERRA-2 and CAMS EAC4 reanalysis models, harmonized to remove inconsistencies and provide a reliable input for soiling estimation (see Figure 3 for a global overview).

  • Rainfall: Measured rainfall data are used to minimize uncertainties from modeled precipitation, as rainfall is a key factor in natural cleaning of PV modules.

  • Additional weather parameters: Wind speed, temperature, relative humidity, and atmospheric pressure from the ERA5 and ERA5-Land reanalyses are included, as these factors can influence both soiling accumulation and cleaning.

  • PV module configuration: Site-specific details such as mounting type and tilt are taken from the measurement metadata, so that the model reflects the actual installation parameters.

Evaluation framework

  • Data processing: All observational data are quality-controlled and aggregated to daily values, with additional filtering to identify and correct potential errors.

  • Local particle sources: Five sites with strong local emissions (e.g., from nearby industry or agriculture) are considered outliers. These emissions are not captured by global reanalysis models, so these sites are analyzed separately to avoid skewing the overall results.

  • Evaluation period: The duration varies by station, typically covering about one year, depending on data availability.

  • Accumulation period: The model considers the full 24-hour period for particle accumulation, including nighttime, to accurately reflect daily soiling dynamics.

  • Configuration and optimization: Multiple model configurations were tested. The final version was selected for its best overall performance, and different rainfall cleaning efficiencies were explored to optimize the model’s response to local conditions.

Figure 3:  Global overview of Solargis harmonized PM10 data.

Validation results

The Solargis soiling model was evaluated using data from 51 ground-based stations covering a wide range of climatic conditions. The validation focused on comparing the modeled soiling loss to measured values, using daily time resolution and several statistical metrics to assess accuracy and consistency.

Key findings:

  • The model generally shows good alignment with reference data, accurately tracking the average trend of soiling loss over time. However, performance varies by location, especially where strong local sources of airborne particulates exist that are not detected by global reanalysis models.

  • Atmospheric parameters, particularly the concentration of suspended particulate matter, have a significant influence on model performance. Rain plays a dual role, acting as both a cleaning and soiling agent, but wet deposition is not yet included in the current model version.

  • Additional factors such as wind speed and relative humidity may also affect soiling accumulation and cleaning, but are not fully accounted for in the model, potentially contributing to observed differences between modeled and measured data.

Number of validation sites | Soiling loss [%] bias | Soiling loss [%] stdev

51 (all sites) | -0.5% | 1.7%

46 (without outliers) | 0.1% | 0.9%

The bias is the average deviation between modeled and measured soiling loss. Across all sites the bias is -0.5%, meaning the model slightly underestimates soiling loss. When the five outlier sites (those with strong local emissions) are excluded, the bias changes to 0.1%, indicating virtually no systematic difference overall.

The standard deviation (stdev) reflects the spread of deviations. It is 1.7% across all sites, but drops to 0.9% when outliers are excluded, demonstrating higher consistency in locations without strong local particulate sources.
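
The indicators used in this validation (bias, MAD, RMSD, and the standard deviation of deviations) can be computed along the following lines. This is a minimal sketch that assumes the usual definitions, with deviations taken as modeled minus measured; the source does not give the exact formulas or say whether statistics are pooled over days or over sites. The function name and the sample values are made up.

```python
from math import sqrt
from statistics import mean, pstdev


def deviation_metrics(modeled, measured):
    """Summarize modeled-minus-measured deviations (same units as the inputs)."""
    deviations = [m - o for m, o in zip(modeled, measured)]
    return {
        "bias": mean(deviations),                       # mean deviation
        "mad": mean(abs(d) for d in deviations),        # mean absolute deviation
        "rmsd": sqrt(mean(d * d for d in deviations)),  # root mean square deviation
        "stdev": pstdev(deviations),                    # spread around the bias
    }


# Made-up soiling loss values [%], for illustration only.
modeled = [1.0, 2.0, 3.0, 4.0]
measured = [1.5, 2.5, 2.5, 5.5]
metrics = deviation_metrics(modeled, measured)
```

A negative bias, as in this example, means the model underestimates the measured loss. With the population standard deviation used here, the three quantities are linked by RMSD² = bias² + stdev², so a model can have a near-zero bias and still show a sizeable RMSD if the deviations are widely spread.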

Temporal and spatial analysis

  • Figure 4: Shows the time series of modeled versus observed daily soiling loss. The model captures the average trend, though measured data often display greater variability due to local events and measurement noise.

  • Figure 5: Presents monthly averages of soiling loss weighted by global tilted irradiation (GTI). The model tends to slightly overestimate losses in areas with low soiling accumulation and underestimate them in regions with higher accumulation. The five outlier sites, affected by strong local emissions, are clearly marked.

  • Figure 6: Illustrates the time series for one of the outlier sites, where measured soiling loss reaches extremely high values during the dry season before manual cleaning, indicating a strong local source of particulates not captured by the model’s input data.

  • Figure 7: An emissions map confirms high local particulate emissions at the outlier sites, supporting the observed discrepancies.
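
The GTI weighting behind the monthly averages in Figure 5 can be sketched as below. The source does not state the exact formula, so this assumes the straightforward weighted mean, where each day's soiling loss counts in proportion to that day's GTI. The function name and the input values are hypothetical.

```python
from collections import defaultdict
from datetime import date


def gti_weighted_monthly_loss(days, loss_pct, gti):
    """Return {(year, month): GTI-weighted mean soiling loss [%]}."""
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for day, loss, irradiation in zip(days, loss_pct, gti):
        key = (day.year, day.month)
        weighted_sum[key] += loss * irradiation
        weight_total[key] += irradiation
    # Months with zero total irradiation are skipped to avoid dividing by zero.
    return {key: weighted_sum[key] / weight_total[key]
            for key in weighted_sum if weight_total[key] > 0}


# Made-up daily values: soiling loss [%] and daily GTI sums [kWh/m2].
days = [date(2023, 1, 1), date(2023, 1, 2), date(2023, 2, 1)]
loss_pct = [1.0, 3.0, 2.0]
gti = [2.0, 6.0, 5.0]

monthly = gti_weighted_monthly_loss(days, loss_pct, gti)
# -> {(2023, 1): 2.5, (2023, 2): 2.0}
```

The January value of 2.5% is higher than the plain average of 2.0% because the day with the larger loss also received more irradiation. Weighting this way ties the monthly figure to the energy actually affected by soiling.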

Figure 4:  Time series of modeled versus observed soiling loss [%].

Figure 5:  Modeled versus observed soiling loss [%]. Observed values are plotted on the x-axis and modeled values on the y-axis; the 1:1 line (red dotted) indicates ideal agreement. The five outlier sites are enclosed by a red outline.

Figure 6:  Time series of modeled versus observed soiling loss [%] for one of the sites strongly affected by local emissions.

Figure 7:  Map of emissions showing the location of the site from Figure 6.

Despite the inherent complexity and local variability of soiling, the Solargis model demonstrates strong consistency and reliability, effectively tracking daily soiling accumulation trends. Its physics-based design allows adaptation to site-specific conditions, though further improvements are needed to capture the effects of wet deposition, particle properties, and local emissions more accurately.

Conclusions

  • Soiling complexity: Soiling is highly variable both spatially and temporally, often changing dramatically even over small distances or short timescales. This makes accurate modeling essential for reliable PV system performance estimates.

  • Model validation: The Solargis soiling model has been rigorously validated against 51 ground-based stations worldwide, with extensive quality control to ensure the reliability of reference data. The model performs well overall, especially when local emission outliers are considered separately.

  • Model flexibility: The physics-based design of the model allows it to adapt to site-specific conditions and supports the development of tailored corrections for different environments and PV configurations.

  • Uncertainty factors: Key sources of uncertainty include the spatial resolution of input data, the dynamics of cleaning (especially rainfall and wind), and the quality and completeness of reference measurements. The limited number of sites means some climates and system types are not fully represented.

  • Research and future improvements: Ongoing research focuses on improving the modeling of rain and wind cleaning efficiencies, which are often constrained by the temporal resolution of available data. Future model versions will incorporate wet deposition processes and account for the physicochemical properties of particles, such as black carbon, organic carbon, sulfates, and pollen, which can significantly impact soiling and PV performance.