Data quality control & analysis

In this document

You will learn why data quality control (QC) is important and necessary in the solar industry. We also introduce the Solargis approach and outline our process for data quality control.

Importance of data quality control

In the solar energy industry, the quality control (QC) of ground measurements and PV data is a critical process that ensures the reliability of data used for site adaptation and validation of satellite models. As solar energy systems become increasingly sophisticated, the need for accurate and dependable data has never been more essential. Quality control serves as a necessary step in data analysis, as faulty readings can significantly affect final results, leading to suboptimal system performance and misguided decision-making.

The complexity of solar energy data necessitates a thorough quality control process across three main groups of measurements:

  • Solar measurements from instruments like pyranometers and pyrheliometers.

  • Meteorological measurements from instruments such as thermometers, humidity sensors, and anemometers.

  • Photovoltaic (PV) data such as PV yield or inverter readings.

Given the variety of potential issues, ranging from instrument-specific problems to site-specific challenges, it is common for measurement datasets to contain discrepancies. Implementing QC protocols for all datasets is therefore crucial to maintaining data integrity.

Examples of issues that may arise include hardware-related problems such as time shifts, drifts, calibration errors, instrument misalignment, and tracker malfunctions. Additionally, natural factors like shading, snow accumulation, and soiling on instruments can further compromise measurement accuracy.

Data that does not pass quality control must be excluded from further processing to ensure that analyses are based on reliable information. Allowing flawed data to be included can lead to erroneous conclusions and ineffective strategies, ultimately undermining the goals of solar energy projects. By emphasizing rigorous quality control practices, stakeholders can enhance the reliability of their data and contribute to a more sustainable energy future.

Best practices for data collection

Before implementing data quality control measures, it is essential to collect data in a specific manner to ensure its accuracy and reliability:

Ground measurements (solar) QC best practices

  • Comprehensive station metadata: The dataset should come with detailed station metadata covering the location, installed instruments, type of mounting, full GTI configuration, and datasheets for all hardware components. The data itself should be high-resolution, ideally recorded at one-minute intervals.

  • High-quality instruments with redundancy: The data should be recorded by Class-A instruments, which ensure high-quality measurements that meet industry standards for precision, along with built-in redundancy in the instrument setup to enhance reliability and minimize the risk of data loss.

  • Regular maintenance and cleaning schedule: Frequent maintenance and cleaning practices are essential to ensure optimal performance and accuracy of the instruments. Instruments should be cleaned twice a week or bi-weekly, depending on site conditions, to keep them free from debris and contaminants.

  • Instrument calibration: Calibration should be performed every two years or as recommended by the manufacturer, with calibration certificates readily available for verification, and the maintenance events should be recorded for transparency.

  • Instrument installation: Instruments should be installed at least 1.5 meters above ground level to minimize interference from environmental factors and enhance measurement accuracy. They should also be equipped with ventilator units to maintain optimal operating conditions and prevent overheating.

Ground measurements (meteorological) QC best practices

  • Comprehensive station metadata: Station metadata should include essential information such as the location, installed instruments, and datasheets for all hardware components, ensuring you have a complete understanding of the system.

  • Resolution and uncertainty: The data resolution should ideally match the time steps used for solar data (1-minute, 10-minute, etc.), with measurement uncertainty low enough to keep the data reliable and accurate.

  • Maintenance and cleaning practices: Regular maintenance is crucial, primarily involving visual inspections of the instruments. Cleaning should occur once a week or bi-weekly, depending on site conditions, to ensure optimal performance.

  • Calibration standards: Calibration should be updated according to the manufacturer's recommendations, with calibration certificates readily available for verification.

  • Optimal instrument installation: Instruments should be installed at heights of 1.25 to 2 meters above ground level for temperature and relative humidity measurements and at 10 meters above ground level for wind direction, speed, and gust measurements. Additionally, instruments must be placed in open areas, avoiding proximity to buildings or other structures to minimize interference.

  • Representative measurement locations: For solar field measurement sites, it is essential to position instruments within the solar field itself rather than at the edges, as this provides more representative data for solar energy assessment.

PV data QC best practices

Power Plant Metadata

  • Comprehensive plant description: The power plant metadata should offer a complete overview of the PV plants, including components, layouts, locations, environmental conditions, and installation/commissioning dates.

  • Location and electrical connection details: The metadata should include specific geographical information about the site, along with a detailed layout of the electrical connections, specifying the inverters and transformers used in the system.

  • Installed power and module specifications: The documentation should have information on the total installed power capacity and types of photovoltaic modules. It should also include details about the mounting type, including hardware structures, layout of the PV module field, row configurations, pitch, and limits for 1-axis trackers if applicable.

  • Module orientation and configuration: The metadata should provide information on the clearance, tilt, and azimuth angles of the PV modules to ensure optimal solar exposure.

  • Datasheets and supplementary materials: The documentation should have datasheets for all hardware components available and include additional materials such as photos of the site or power plant, CAD drawings of mounting structures or floaters, and topographical maps.

Time Series of PV Power Production Data

  • Data resolution and sources: The time series of PV power production should have a resolution of 15 minutes or finer and be derived from SCADA or a monitoring system. This data should be accessible at both the inverter level and at the point of interconnection (electricity meter).

  • Irradiation data: The time series data should include records of in-plane irradiance (Global Tilted Irradiance, GTI) so that solar input can be correlated with power output.
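
Before any further QC, it can help to confirm that a delivered time series actually meets the expected resolution and to list where records are missing. The following is a minimal sketch under stated assumptions: the function name `sampling_report`, the sample log, and the rule that any step longer than the typical one counts as a gap are illustrative choices, not part of the Solargis process.

```python
from datetime import datetime, timedelta
from statistics import median


def sampling_report(timestamps, max_step=timedelta(minutes=15)):
    """Summarize the sampling of a sorted list of datetime objects.

    max_step is the coarsest acceptable resolution (15 minutes here).
    """
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    pairs = list(zip(timestamps, timestamps[1:]))
    steps = [(b - a).total_seconds() for a, b in pairs]
    typical = median(steps)  # robust against a small number of gaps
    # Assumption: any step longer than the typical one is reported as a gap.
    gaps = [(a, b) for (a, b), s in zip(pairs, steps) if s > typical]
    return {
        "typical_step": timedelta(seconds=typical),
        "meets_resolution": timedelta(seconds=typical) <= max_step,
        "gaps": gaps,
    }


# Hypothetical 10-minute inverter log with the 08:30 record missing.
start = datetime(2024, 6, 1, 8, 0)
log = [start + timedelta(minutes=m) for m in (0, 10, 20, 40, 50)]
report = sampling_report(log)
print(report["typical_step"], report["meets_resolution"], report["gaps"])
```

Using the median step rather than the first step keeps the estimate stable when the series starts with a gap.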

Best practices for data quality control

Quality control best practices

After collecting data according to established best practices, the next stage involves implementing quality control (QC) measures to ensure the data’s integrity and reliability:

  • Time reference check (TRC): Aligning time stamps with a universal time reference, such as UTC, involves understanding the original time reference system and applying necessary adjustments for local time or daylight saving changes.

  • Automatic QC procedures: Conducting systematic checks to identify potential issues in solar measurement data includes verifying metadata configurations, filtering out invalid signals, and detecting advanced problems related to shading or equipment malfunctions.

  • Visual quality control procedures: Performing manual reviews to complement automatic checks involves analyzing results from these processes to identify subtle problems through visual inspection and comparing measured data with reference parameters to spot anomalies.


The Solargis approach

At Solargis, we put strong emphasis on data quality control by implementing comprehensive and rigorous procedures that include reviewing the existing device documentation, error identification, and statistical analysis to ensure the integrity and reliability of solar and meteorological measurements across thousands of locations globally.

The three main areas where we perform quality control at Solargis are solar measurements, meteorological parameters, and photovoltaic (PV) data.

Each of these areas is further divided into time reference checks, automatic QC procedures, and visual QC procedures.

Solar measurements QC checks

By implementing the following QC checks, we ensure the integrity and reliability of our solar irradiance data, ultimately supporting accurate assessments and optimization of solar energy systems further down the line.

| Relevant radiation parameters | Description |
| --- | --- |
| GHI (Global Horizontal Irradiance) | The total amount of solar radiation received on a horizontal surface. |
| DNI (Direct Normal Irradiance) | The amount of solar radiation received by a surface that is always held perpendicular to the sun's rays. |
| DIF (Diffuse Irradiance) | The portion of solar radiation that has been scattered by molecules and particles in the atmosphere and arrives at a surface from all directions. |
| ALB (Surface Albedo)* | The reflectivity of a surface, indicating how much solar radiation is reflected back into the atmosphere. |
| RHI (Reflected Horizontal Irradiance) | The amount of solar radiation reflected off surrounding surfaces onto a horizontal plane. |
| GTI (Global Tilted Irradiance) | The total solar radiation received on a tilted surface, which is crucial for optimizing solar panel placement. |

Table: Solar parameters relevant to the solar measurement QC checks.
* The surface albedo is an environmental parameter; we have included it here for simplicity and a better understanding of the QC procedure.

Time reference check

Time reference check (TRC) is carried out to ensure that the time stamps of ground measurement data are accurately aligned with the time reference used by Solargis (UTC). Key aspects of TRC include:

  • Understanding the original time reference system used in data collection and converting local time to UTC.

  • Applying necessary time shifts when data is measured in local time or when daylight saving time is observed, which requires adjustments at specific times of the year.

  • Detecting any additional time shifts and drifts, whether gradual or abrupt, using an automatic tool developed in-house.

  • Confirming applied shifts manually through specialized plots to ensure accuracy.

  • Recognizing that accurate time stamps are crucial for subsequent QC checks, as incorrect time stamps can cause valid data to be flagged as erroneous.
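
The conversion to UTC described above depends on how the logger clock was configured. The sketch below uses Python's standard `datetime` and `zoneinfo` modules to illustrate the two common cases: a clock at a constant offset from UTC, and a clock that follows civil time with daylight saving changes. The function names and the example zone are assumptions made for illustration; this is not the Solargis in-house tool.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def fixed_offset_to_utc(naive_ts, utc_offset_hours):
    """Logger clock runs at a constant offset from UTC all year (no DST)."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return naive_ts.replace(tzinfo=tz).astimezone(timezone.utc)


def civil_time_to_utc(naive_ts, zone_name):
    """Logger clock follows civil time, including daylight saving changes."""
    return naive_ts.replace(tzinfo=ZoneInfo(zone_name)).astimezone(timezone.utc)


# Hypothetical station in Central Europe, local noon in summer and in winter.
summer = civil_time_to_utc(datetime(2024, 7, 1, 12, 0), "Europe/Berlin")
winter = civil_time_to_utc(datetime(2024, 1, 15, 12, 0), "Europe/Berlin")
standard = fixed_offset_to_utc(datetime(2024, 7, 1, 12, 0), 1)
print(summer.hour, winter.hour, standard.hour)
```

Time stamps that fall in the repeated hour when daylight saving time ends are ambiguous in civil time. The `fold` attribute of `datetime` selects which occurrence is meant, so such records usually need a manual decision.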

Automatic QC procedures

Automatic QC procedures involve several systematic checks designed to identify and flag potential issues in solar measurement data. These include:

  • Metadata checks: Verifying GTI configuration and ensuring correct parameter types are utilized.

  • Data checks: Implementing physical limits to filter out unphysical signals (e.g., static or invalid readings) and ensuring consistency between different components and instruments.

  • Advanced issue detection: Identifying problems related to shading, dew formation, tracker malfunctions, stowing events, or maintenance activities.
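
Two of the data checks above can be sketched in a few lines: a physical-limit test and a consistency test between components, which relies on the relation GHI = DNI · cos(solar zenith angle) + DIF. All thresholds, flag names, and the function name below are illustrative assumptions and not the limits used by Solargis. The solar zenith angle is assumed to be computed elsewhere.

```python
import math

# Illustrative thresholds only (assumptions, not Solargis' actual limits).
LOWER_LIMIT = -4.0        # W/m2, tolerates small night-time offsets
GHI_UPPER_LIMIT = 1500.0  # W/m2
CLOSURE_TOLERANCE = 0.08  # relative mismatch allowed between components
CLOSURE_MIN_GHI = 50.0    # W/m2, skip the closure test in very low light


def qc_solar_sample(ghi, dni, dif, zenith_deg):
    """Return a list of QC flags for one record (empty list = passed)."""
    flags = []
    if not LOWER_LIMIT <= ghi <= GHI_UPPER_LIMIT:
        flags.append("ghi_out_of_range")
    if dni < LOWER_LIMIT or dif < LOWER_LIMIT:
        flags.append("negative_component")
    cos_z = math.cos(math.radians(zenith_deg))
    # Closure test: measured GHI should match DNI * cos(zenith) + DIF.
    if ghi > CLOSURE_MIN_GHI and cos_z > 0:
        expected_ghi = dni * cos_z + dif
        if abs(ghi - expected_ghi) / ghi > CLOSURE_TOLERANCE:
            flags.append("closure_mismatch")
    return flags


print(qc_solar_sample(750.0, 700.0, 150.0, 30.0))  # consistent components
print(qc_solar_sample(750.0, 300.0, 150.0, 30.0))  # DNI far too low for this GHI
```

A closure mismatch does not say which instrument is at fault. It only marks the record for the more detailed automatic and visual checks.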

Visual QC procedures

Visual quality control procedures complement automatic checks by providing a manual review process. These procedures involve:

  • Reviewing the results from automatic QC processes and flagging any remaining issues for further investigation.

  • Identifying subtle problems such as soiling, misalignment, or calibration issues through visual inspection.

  • Analyzing measured data using daily time series profiles, heatmaps, or scatter plots to detect anomalies over time.

  • Comparing measured data with reference parameters—whether from redundant measurements, calculated values, or model parameters—facilitating easier identification of issues within the dataset.

By implementing these comprehensive QC checks for solar measurements, Solargis ensures the integrity and reliability of its solar irradiance data, ultimately supporting accurate assessments and optimization of solar energy systems.
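
A heatmap for visual QC is typically a matrix with one row per day and one column per time-of-day slot, so that shading or time shifts show up as recurring patterns. The sketch below shows one way such a matrix could be assembled from (time stamp, value) records. The function name, the hourly default step, and the sample values are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime


def to_day_by_time_matrix(records, step_minutes=60):
    """Arrange (timestamp, value) pairs into rows = days, columns = time slots.

    Slots without a record stay None so that gaps remain visible.
    """
    slots = 24 * 60 // step_minutes
    rows = defaultdict(lambda: [None] * slots)
    for ts, value in records:
        slot = (ts.hour * 60 + ts.minute) // step_minutes
        rows[ts.date()][slot] = value
    days = sorted(rows)
    return days, [rows[d] for d in days]


# Hypothetical hourly GHI records in W/m2.
records = [
    (datetime(2024, 6, 1, 12, 0), 900.0),
    (datetime(2024, 6, 1, 13, 0), 850.0),
    (datetime(2024, 6, 2, 12, 0), 400.0),
]
days, matrix = to_day_by_time_matrix(records)
print(days, matrix[0][12], matrix[1][13])
```

The resulting matrix can be passed to any plotting library after the None entries are converted to a missing-value marker such as NaN.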

Meteorological measurements QC checks

Quality control (QC) of meteorological measurements ensures the accuracy and reliability of data prepared for further processing. We apply rigorous checks to the most common meteorological parameters: a thorough time reference check, automatic QC procedures, and detailed visual inspections.

| Relevant meteorological parameters | Description |
| --- | --- |
| TEMP (Air Temperature) | The measure of ambient air temperature, which affects solar panel performance. |
| RH (Relative Humidity) | The amount of moisture in the air, influencing both energy production and equipment efficiency. |
| WS (Wind Speed) | The speed of wind, which can impact cooling and overall system performance. |
| WG (Wind Gust Speed) | The maximum speed of wind gusts, important for assessing potential stress on solar installations. |
| WD (Wind Direction) | The direction from which the wind is blowing, relevant for understanding environmental conditions. |
| AP (Atmospheric Pressure) | The pressure exerted by the atmosphere, which can affect weather patterns and system performance. |

Table: Meteorological parameters relevant to the meteorological measurement QC checks.

Time reference check

For meteorological data, we first apply a time reference shift correction, similar to our approach for solar measurements. This correction is assessed either through solar radiation parameters or by comparing with Solargis data to ensure accurate alignment with UTC.
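
One common way to estimate a time shift against a reference series is to slide the measured series over the reference and keep the lag with the highest correlation. The sketch below illustrates that idea with synthetic data. It is an assumption about how such a check could work, not a description of the in-house tool, and the names `pearson` and `estimate_shift` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def estimate_shift(measured, reference, max_lag):
    """Return (lag, correlation) for the best alignment.

    A positive lag means the measured series runs `lag` samples late
    relative to the reference.
    """
    n = len(measured)
    best_lag, best_r = 0, -2.0
    for lag in range(-max_lag, max_lag + 1):
        # Pair measured[i + lag] with reference[i] where both exist.
        idx = [i for i in range(n) if 0 <= i + lag < n]
        r = pearson([measured[i + lag] for i in idx], [reference[i] for i in idx])
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r


# Synthetic 15-minute day: a half-sine "clear-sky" profile as reference,
# and a measured copy that is delayed by three samples (45 minutes).
reference = [max(0.0, math.sin(math.pi * (i - 20) / 56)) for i in range(96)]
measured = [0.0] * 3 + reference[:-3]
lag, r = estimate_shift(measured, reference, max_lag=8)
print(lag, round(r, 3))
```

With real data the correlation peak is flatter, especially on cloudy days, so a detected lag is best confirmed on plots before the correction is applied.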

Automatic QC procedures

Our automatic quality control procedures involve systematic checks designed to identify and flag potential issues in meteorological data. These procedures include:

  • Metadata checks: Verifying the correct data unit has been provided to ensure consistency across datasets.

  • Data checks: Implementing physical limits to filter out unphysical signals, such as static or invalid readings.

  • Advanced issue detection: Identifying outliers and invalid patterns that indicate measurement errors.
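
The data checks above can be illustrated with two small helpers: one that flags static readings (a sensor stuck on the same value) and one that applies physical limits. The function names, the run length, and the temperature limits in the example are illustrative assumptions, not the values used by Solargis.

```python
def flag_static_runs(values, min_run=6, tol=0.0):
    """Flag samples that belong to a run of at least min_run near-identical values."""
    flags = [False] * len(values)
    start = 0
    for i in range(1, len(values) + 1):
        # Close the current run when the value changes or the series ends.
        if i == len(values) or abs(values[i] - values[start]) > tol:
            if i - start >= min_run:
                for j in range(start, i):
                    flags[j] = True
            start = i
    return flags


def flag_out_of_range(values, lower, upper):
    """Flag samples outside the closed interval [lower, upper]."""
    return [not (lower <= v <= upper) for v in values]


# Hypothetical air temperature series in degrees Celsius.
temp = [20.1, 20.3, 21.0, 21.0, 21.0, 21.0, 21.5]
print(flag_static_runs(temp, min_run=4))
print(flag_out_of_range([15.0, 95.0, -80.0], lower=-60.0, upper=60.0))
```

Some static periods are physically valid, for example zero wind speed in calm conditions or relative humidity held at saturation, so the run length and tolerance need to be chosen per parameter.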

Visual QC procedures

Our visual quality control procedures complement automatic checks by providing a manual review process. These procedures involve:

  • Reviewing the results from automatic QC processes and flagging any remaining issues.

  • Conducting visual inspections similar to those performed for solar parameters to identify subtle problems in the data.

PV data QC checks

Quality control (QC) of photovoltaic (PV) output data is another area we focus on at Solargis. We understand the importance of high-quality PV data and what it means for solar project developers.

| PVOUT parameters | Description |
| --- | --- |
| PVOUT | The overall power output from the solar installation. |
| Inverter data | Information related to the performance and operational status of inverters, which are critical for energy conversion. |

Table: Parameters relevant to the PV data QC checks.

Time reference check

For PV output data, we utilize an automatic tool developed in-house to detect time shifts and drifts, similar to the approach used for solar parameters. We confirm the applied shifts manually using specialized plots to ensure accuracy. Correct time stamps are essential, as inaccuracies can lead to valid data being misidentified as incorrect.

Automatic QC procedures

Our automatic quality control procedures encompass a range of systematic checks designed to maintain high data quality. These include:

  • Metadata checks: Verifying the configuration of the PV plant to ensure all parameters are correctly set.

  • Data checks: Implementing physical limits to filter out unphysical signals, such as static or invalid readings, and identifying general underproduction issues.

  • Advanced issue detection: Monitoring for conditions such as shading, snow accumulation, soiling, clipping, and tracker malfunctions.
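
Of the conditions listed above, clipping is the simplest to illustrate: power stays at the inverter's AC limit while irradiance keeps rising. The sketch below flags samples that sit at or just below that limit. The parameter names `ac_limit_kw` and `rel_tol` and the sample values are hypothetical, and the actual Solargis detection logic is not described in this document.

```python
def flag_clipping(power_kw, ac_limit_kw, rel_tol=0.01):
    """Flag samples at or within rel_tol of the inverter AC power limit."""
    threshold = ac_limit_kw * (1.0 - rel_tol)
    return [p >= threshold for p in power_kw]


# Hypothetical inverter with a 1000 kW AC limit.
power = [500.0, 995.0, 1000.0, 1000.0, 800.0]
flags = flag_clipping(power, ac_limit_kw=1000.0)
print(flags, sum(flags) / len(flags))
```

Clipped samples are a feature of the plant design rather than a measurement error, so they are usually labeled and kept instead of being excluded.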

Visual QC procedures

To complement our automatic checks, we conduct visual quality control procedures that involve:

  • Reviewing the results from automatic QC processes and flagging any remaining issues.

  • Identifying subtle problems such as soiling by analyzing measured data using daily time series profiles, heatmaps, or scatter plots to detect anomalies over time.

  • Comparing generated data against reference parameters—whether from calculated values or PV simulation—to facilitate easier identification of issues within the dataset. This comparison helps distinguish between failures (e.g., soiling) and features (e.g., clipping) of the power plant, guiding us in determining how to further process the dataset.
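
The comparison against simulated values can be reduced to a daily ratio of measured to simulated energy, which makes persistent underproduction easy to spot. The following is a minimal sketch, assuming the simulated values come from a separate PV simulation. The function names, the 0.9 threshold, and the sample records are illustrative assumptions.

```python
from collections import defaultdict


def daily_measured_to_simulated(records):
    """records: iterable of (day, measured_kwh, simulated_kwh) per time step.

    Returns {day: measured energy / simulated energy} for days with
    non-zero simulated energy.
    """
    measured = defaultdict(float)
    simulated = defaultdict(float)
    for day, m, s in records:
        measured[day] += m
        simulated[day] += s
    return {d: measured[d] / simulated[d] for d in sorted(measured) if simulated[d] > 0}


def flag_underproduction(ratios, threshold=0.9):
    """Return the days whose ratio falls below the (assumed) threshold."""
    return [d for d, r in ratios.items() if r < threshold]


# Hypothetical records: two time steps on each of two days.
records = [
    ("2024-06-01", 50.0, 50.0),
    ("2024-06-01", 45.0, 50.0),
    ("2024-06-02", 30.0, 50.0),
    ("2024-06-02", 40.0, 50.0),
]
ratios = daily_measured_to_simulated(records)
print(ratios, flag_underproduction(ratios))
```

A low ratio alone does not identify the cause. Whether it reflects a failure such as soiling or a plant feature such as clipping still has to be judged from the daily profiles.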