Time series and TMY data

In this document

We will explain how information from the multi-year time series can be summarized into a Typical Meteorological Year (TMY). TMY datasets were designed to be used as inputs in PV simulation software for quick simulations or for those engines that are limited to the computation of 8760 values.

Overview

Time Series represents the original data product generated by Solargis, encompassing sub-hourly data values for the entire multi-year recorded period and all available solar, meteorological, and environmental parameters crucial for accurate energy modeling and site-specific performance optimization. Typical Meteorological Year (TMY) datasets condense multi-year time series into a single representative year, ideal for quick PV simulations or tools limited to 8760 hourly values.

The Solargis method generates TMY by selecting representative months based on statistical alignment and similarity of cumulative distribution functions (CDFs). Parameters are weighted according to the application, and only complete years are used to ensure reliability.

Probabilistic TMY scenarios, such as P90 or P75, represent different risk levels, with P90 offering a conservative estimate reflecting low annual irradiation. These are created through iterative refinement and division of data into smaller chunks, ensuring accuracy for applications requiring confidence in solar energy predictions.

Graphic representation of GHI, DNI and TEMP series for a short period of 4 days for a sample location

Parameters included in time series

A solar time series is a dataset containing critical parameters that influence the amount of solar energy received by a power plant, along with the operating conditions that affect performance and losses at a specific location. These time series provide insights for energy modeling and optimization over the available historical period.

By incorporating these parameters into solar time series data, energy simulations can account for site-specific conditions, enabling accurate performance analysis and optimization of PV systems. The parameters included in the time series dataset are detailed below:

  • Global Horizontal Irradiation/Irradiance (GHI): GHI represents the total solar radiation, comprising both direct and diffuse components, received by a horizontal surface. As a reference metric, GHI is commonly used to compare site potential. Alongside DNI, it is essential for calculating Global Tilted Irradiance (GTI), the solar radiation received by the plane of PV modules.

  • Direct Normal Irradiation/Irradiance (DNI): DNI measures the direct solar radiation received perpendicularly on a sun-tracking surface from the solar disk and its surrounding circumsolar disk (5° radius). DNI is crucial for Concentrating Solar Power (CSP) and Concentrating Photovoltaic (CPV) technologies. For standard PV systems, DNI contributes to GTI calculations, particularly for systems with solar trackers.

  • Diffuse Horizontal Irradiation/Irradiance (DIF): DIF is the part of solar irradiance that is scattered by the atmosphere. Higher DIF values relative to GHI indicate a higher occurrence of clouds, aerosols (pollution by atmospheric particles) or higher precipitable water vapor. The DIF/GHI ratio (D2G) gives the share of the diffuse component in global horizontal irradiation.

  • Air Temperature (TEMP): TEMP measured at 2 meters above ground significantly impacts PV system efficiency, influencing module performance and operational environment. It is a key input in energy simulation models.

  • Dew point temperature (TD): TD is the temperature at which air becomes saturated with moisture and dew begins to form. TD is used to assess the risk of condensation, accelerated corrosion, and degradation of PV components. The combination of TD with air temperature and humidity gives a better picture of the thermal conditions of PV modules.

  • Wet bulb temperature (WBT): WBT is the lowest temperature achieved by evaporative cooling of a water-soaked surface. In areas with low wet bulb temperatures, ambient cooling might enhance the performance of PV modules due to the lower operating temperature. High wet bulb temperatures can signify high humidity. The combination of WBT with air temperature and humidity gives a better picture of the thermal conditions of PV modules. It is particularly important in humid regions, where high air moisture reduces cooling efficiency and leads to overheating.

  • Relative Humidity (RH): RH reflects the moisture content in the air and affects PV system soiling rates. Derived from specific humidity, atmospheric pressure, and air temperature, RH data carries higher uncertainty due to its indirect calculation. Measurements are taken at 2 meters above ground.

  • Wind Speed (WS): Wind helps cool PV modules, enhancing performance, but strong winds pose risks to module stability and mounting structures. Wind data, particularly in complex terrains, is vital for determining safe and efficient system orientation.

  • Wind Direction (WD): Wind helps cool PV modules, enhancing performance, but strong winds pose risks to module stability and mounting structures. Wind data, particularly in complex terrains, is vital for determining safe and efficient system orientation.

  • Wind gust (WG): WG is a short, intense burst that significantly exceeds the average wind speed over a short period of time (typically less than 20 seconds). Wind gust can impact the design, stability, performance, and operation of PV systems, especially those that include large PV modules and solar trackers.

  • Atmospheric Pressure (AP): While AP has minimal direct impact on PV systems, it is used to adjust air mass for spectral mismatch correction in energy simulations. Abrupt pressure changes can influence structural components, especially in high-altitude or variable-pressure regions.

  • Precipitation (PREC): Rainfall serves as a natural cleaning mechanism for PV modules, though low-intensity rainfall can worsen soiling by making dust stickier. Modules and structures must be designed to endure heavy rain and flash floods.

  • Precipitable Water (PWAT): PWAT quantifies the total water content in an atmospheric column, influencing solar radiation by absorption and scattering. It is a critical parameter for spectral corrections, affecting light quality and module performance, especially for different semiconductor materials.

  • Water equivalent of accumulated snow depth (SDWE): SDWE is the amount of water in a specific snow depth, expressed as the liquid depth if fully melted. It helps estimate snow losses in PV systems, although such estimates are challenging because of limited measurements, complex snow-PV interactions, varying conditions, and highly variable snow events across seasons and years.
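The relationship between the three irradiance parameters above can be illustrated with a short sketch. It computes the DIF/GHI ratio (D2G) and reconstructs GHI from its components using the standard closure relation GHI = DIF + DNI·cos(solar zenith angle). The function names and sample values are illustrative assumptions, not part of any Solargis tool or dataset.

```python
import math


def diffuse_fraction(ghi, dif):
    """Return the DIF/GHI ratio (D2G), or None when GHI is zero (e.g. at night)."""
    if ghi <= 0:
        return None
    return dif / ghi


def ghi_from_components(dni, dif, zenith_deg):
    """Closure relation: GHI = DIF + DNI * cos(solar zenith angle)."""
    # Clamp at zero so a sun below the horizon contributes no direct beam.
    cos_zenith = max(math.cos(math.radians(zenith_deg)), 0.0)
    return dif + dni * cos_zenith


# Illustrative instantaneous values in W/m2 (made up for this sketch).
clear_sky = diffuse_fraction(ghi=800.0, dif=100.0)   # low ratio: mostly direct light
overcast = diffuse_fraction(ghi=200.0, dif=190.0)    # high ratio: mostly diffuse light
```

A D2G close to 1 points to overcast or hazy conditions, while low values indicate clear skies dominated by the direct component.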

Additional parameters, such as Ground Surface Albedo (ALB), are included as monthly averages for subsequent calculations but are not integrated into the time series alongside other parameters. Albedo, which measures the reflectivity of the ground, plays a critical role in determining the amount of reflected sunlight available for bifacial PV modules. Including albedo in energy models enhances the accuracy of performance predictions, particularly for bifacial systems, by accounting for the impact of reflected light on overall energy generation.

Quality flags

One of the key benefits of Solargis time series data is that the data have no gaps. However, sometimes there are gaps in the archive of satellite images that are used as input in the Solargis satellite-based model. For time stamps with missing satellite data, we apply intelligent statistical algorithms to deliver time series datasets without any gaps. Techniques used for gap-filling depend on the time of day and how much data is missing.

Each irradiance data value in the Solargis time series dataset is accompanied by a quality flag (FlagR) that indicates how the cloud information was derived from the satellite data. An explanation of the irradiance quality flags used by Solargis is given below:

Solar irradiance flags included in Time Series datasets

Flag | Description
0 | Sun below the horizon: no cloud information is calculated as there is no solar radiation.
1 | Model value: data come from the satellite model, and no data are missing.
2 | Interpolated <=1 hour: a few time slots are missing and these are interpolated from surrounding hours.
3 | Extrapolated <=1 hour: a few time slots are missing at the beginning or end of the day, and cloud information is extrapolated from the closest available value.
4 | Interpolated/extrapolated >1 hour: the same as flags 2 and 3, but the period of missing data is longer than 1 hour.
5 | Long-term monthly median or persistence: large gaps in the data (e.g. a whole day) are replaced by data from the previous day.
6 | Synthetic data: the same as flag 5, but data are replaced by synthetically generated data.
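As a rough illustration of how the flags might be used, the sketch below counts flag occurrences and reports the share of daytime records that were gap-filled. It assumes FlagR is stored as integer codes 0 to 6 in the order the table lists the flags; that numbering, and the helper names, are assumptions to confirm against the header of the delivered file.

```python
from collections import Counter

# ASSUMPTION: integer codes 0-6 in table order; check the data file header.
FLAG_NAMES = {
    0: "Sun below the horizon",
    1: "Model value",
    2: "Interpolated <=1 hour",
    3: "Extrapolated <=1 hour",
    4: "Interpolated/extrapolated >1 hour",
    5: "Long-term monthly median or persistence",
    6: "Synthetic data",
}


def flag_counts(flags):
    """Count how many records carry each flag, keyed by flag name."""
    return {FLAG_NAMES[code]: n for code, n in Counter(flags).items()}


def gap_filled_share(flags):
    """Fraction of daytime records (flag != 0) that were gap-filled (flags 2-6)."""
    daytime = [f for f in flags if f != 0]
    if not daytime:
        return 0.0
    filled = sum(1 for f in daytime if f >= 2)
    return filled / len(daytime)


# Hypothetical flag column for nine consecutive time stamps.
sample = [0, 0, 1, 1, 1, 2, 1, 5, 0]
share = gap_filled_share(sample)
```

A high gap-filled share for a given period suggests treating the irradiance values from that period with extra caution.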

Solargis method for TMY generation

The Solargis Typical Meteorological Year (TMY) P50 is created by selecting the most representative months from a historical time series (e.g., the most typical January, February, March, etc.) and combining them into a single artificial year. This method ensures the TMY reflects long-term typical conditions while maintaining statistical accuracy and representativeness.

The selection of representative months is based on two primary criteria:

  • Statistical characteristics alignment: The TMY aims to minimize the difference between its statistical characteristics (e.g., annual and monthly averages) and those of the original time series. This criterion carries approximately 80% of the weighting in the selection process.

  • Cumulative Distribution Function (CDF) similarity: The TMY ensures that the occurrence of typical hourly values for each month closely aligns with the distribution observed in the original time series. This criterion accounts for the remaining 20% of the weighting.

The weighting of each parameter used to select the typical months is customized based on the type of solar energy application. For instance, different weights may be assigned to direct normal irradiance (DNI), global horizontal irradiance (GHI), diffuse irradiance (DIF), and air temperature (TEMP) at 2 meters, depending on the specific requirements of the analysis. Other meteorological parameters may also be included but typically carry lower accuracy and relevance and thus do not influence the choice of representative months.

When generating TMY data, we try to select months in such a way that the annual sum of GHI/DNI values in the TMY file is consistent with the annual average calculated from the time series. However, it may not be possible to find representative months where the sum of irradiation as well as meteo values will equal the long-term average. Therefore, we may be required to slightly adjust the meteo values to maintain similar averages as calculated from time series.

To ensure data consistency and reliability, only complete years from the historical time series are used in the construction of the TMY.
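The two selection criteria can be sketched for a single parameter and a single calendar month as follows. This is a simplified illustration, not the Solargis implementation: the CDF term uses a mean absolute difference between empirical CDFs (a Finkelstein-Schafer style statistic chosen here for simplicity), the two terms are not normalized to a common scale, and all function names are hypothetical. Only the 80/20 weighting comes from the description above.

```python
from bisect import bisect_right


def mean(values):
    return sum(values) / len(values)


def cdf_distance(candidate, reference, bins=20):
    """Mean absolute gap between two empirical CDFs at evenly spaced thresholds."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        return 0.0
    cand, ref = sorted(candidate), sorted(reference)
    total = 0.0
    for i in range(1, bins + 1):
        x = lo + (hi - lo) * i / bins
        # bisect_right gives the number of values <= x, i.e. the empirical CDF.
        total += abs(bisect_right(cand, x) / len(cand) - bisect_right(ref, x) / len(ref))
    return total / bins


def pick_typical_year(values_by_year, w_stat=0.8, w_cdf=0.2):
    """values_by_year maps year -> hourly values of one calendar month.

    Returns the year with the lowest weighted score, plus all scores.
    """
    pooled = [v for vals in values_by_year.values() for v in vals]
    long_term_mean = mean(pooled)
    scale = abs(long_term_mean) or 1.0  # avoid division by zero
    scores = {}
    for year, vals in values_by_year.items():
        stat_term = abs(mean(vals) - long_term_mean) / scale
        scores[year] = w_stat * stat_term + w_cdf * cdf_distance(vals, pooled)
    return min(scores, key=scores.get), scores


# Made-up hourly GHI values for three Januaries (four records each, for brevity).
january = {
    2019: [0, 100, 200, 300],
    2020: [0, 150, 300, 450],
    2021: [0, 50, 100, 150],
}
best_year, january_scores = pick_typical_year(january)
```

In a multi-parameter setting, a score like this would be computed per parameter (GHI, DNI, DIF, TEMP) and combined with application-specific weights before choosing the month.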

Creating probability scenarios for TMY

In general practice, various datasets can be derived from the time series:

  • The TMY P50 dataset represents, for each month, the average climate conditions and the most representative cumulative distribution function. Extreme situations (e.g. extremely cloudy weather) are therefore not represented in this dataset.

  • The TMY P90 dataset represents, for each month, climate conditions that, once irradiation values are summed over the whole year, yield a value close to the P90 value derived from statistical analysis of uncertainties and interannual variability. TMY P90 is therefore generally a conservative estimate, i.e. a year whose annual irradiation is close to the lowest identified within the time series. Other scenarios, such as P75, P99, or any PXX, can be derived in the same way.

The method for creating TMY P90 (or other scenarios such as P75, P99, or any PXX) builds upon the TMY P50 methodology but introduces modifications to how candidate months are selected:

The P90 annual values are derived from the combined uncertainty, which accounts for both the generic performance of the model and the interannual variability driven by local weather conditions. The resulting P90 annual value will be the reference for the selection of candidate months.

Instead of focusing on minimizing differences in monthly means and cumulative distribution functions (CDFs), as in the P50 method, the P90 process prioritizes aligning the annual P90 value with the annual sum of the newly constructed TMY. The selection process is iterative, repeatedly refining the set of twelve candidate months until the difference between the annual P90 value and the TMY’s annual sum is minimized. Once the selection criteria are met, the TMY is created by concatenating the chosen months.

To improve the quality of the candidate months and better reflect the probability scenario, the original dataset is divided into smaller 15-day chunks. These smaller segments are recombined in various ways to generate multiple candidate months, increasing the likelihood of finding months that best satisfy the P90 conditions. This approach ensures a more accurate representation of the desired probability scenario, making the resulting TMY suitable for applications requiring specific confidence levels in solar energy predictions.
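The steps above can be sketched in a few lines. The sketch makes several assumptions that are not stated in the text: uncertainties are treated as normally distributed and combined as a root sum of squares, the half-month recombination works on period sums only, and the refinement is a simple one-month-at-a-time search. All names and numbers are hypothetical.

```python
import math
from itertools import product

# Standard normal quantiles for common exceedance probabilities.
Z_SCORES = {"P50": 0.0, "P75": 0.674, "P90": 1.282, "P99": 2.326}


def pxx_annual(p50, model_unc_pct, variability_pct, scenario="P90"):
    """Annual Pxx value from P50 and two uncertainties given at 1 std dev, in percent."""
    combined = math.sqrt(model_unc_pct ** 2 + variability_pct ** 2)
    return p50 * (1.0 - Z_SCORES[scenario] * combined / 100.0)


def recombine_halves(first_halves, second_halves):
    """Pair every first-half sum with every second-half sum to widen the candidate pool."""
    return [a + b for a, b in product(first_halves, second_halves)]


def select_months(candidates, target):
    """candidates: one list of candidate monthly sums per calendar month.

    Repeatedly swaps one month at a time, keeping a swap only if it brings
    the annual sum closer to the target. Returns chosen indices and final gap.
    """
    choice = [0] * len(candidates)

    def annual(ch):
        return sum(candidates[m][i] for m, i in enumerate(ch))

    best_gap = abs(annual(choice) - target)
    improved = True
    while improved:
        improved = False
        for m, options in enumerate(candidates):
            for i in range(len(options)):
                trial = choice[:m] + [i] + choice[m + 1:]
                gap = abs(annual(trial) - target)
                if gap < best_gap:
                    choice, best_gap, improved = trial, gap, True
    return choice, best_gap


# Hypothetical example: P50 of 2000 kWh/m2, 4% model uncertainty, 3% variability.
p90_target = pxx_annual(2000.0, 4.0, 3.0, "P90")
```

The search stops when no single swap reduces the gap, so it may settle on a local optimum; a production method would likely also check that the chosen months remain physically plausible.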

Multi-year time series vs TMY data

Time Series is the original data product generated by the Solargis model. It contains data values for the whole recorded period and all available data parameters. Because the information in a Time Series file is not condensed, it is the most versatile file and can provide inputs for all the calculations needed to run a complete solar resource assessment.

Solargis multi-year time series is most typically used for the following purposes:

  • To understand the seasonal and inter-annual variability of solar resource

  • To understand the occurrence of extreme irradiance and temperature events - for design optimization of solar power systems

  • Input data for energy simulation of solar power systems

  • For accuracy enhancement of long-term satellite-derived irradiance estimates - when high-quality irradiation measurements are available at the project site

The Typical Meteorological Year (TMY) is a popular data product designed for summarizing the average weather conditions of a specific site in a period of a single year. TMY data is primarily used for energy simulation purposes, as popular simulation software such as PVsyst, SAM, etc. typically work with 8760 hourly values representing a typical year. The main reason for the popularity of the TMY dataset for solar energy simulation is the compatibility of such data with popular energy simulation software and the speed of simulation.

The TMY is constructed from the Time Series. However, because a TMY retains only part of the information in the Time Series, it is recommended to use the full time series file whenever possible.

Main differences between Time Series and TMY

Data description | Time Series | TMY
Period | Data from the full period available since 1994/1999/2007 (depending on the region) | Data from a summary year constructed by concatenation of data from typical months
Data values | Up to approx. 876,000 | 8,760
Data time step | 15-minute / Hourly | Hourly
Spatial resolution | 250 meters | 250 meters
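The record counts in the table follow from simple arithmetic, sketched below. The 25-year span used to reproduce the 876,000 figure is an inference from the numbers, not something the table states, and leap days are ignored.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours in a non-leap year


def record_count(years, step_minutes):
    """Number of time steps in a series of the given length and resolution."""
    return years * HOURS_PER_YEAR * 60 // step_minutes


tmy_records = record_count(years=1, step_minutes=60)
# ASSUMPTION: 25 years of 15-minute data, chosen to match the table's figure.
time_series_records = record_count(years=25, step_minutes=15)
```

With these inputs the TMY holds 8,760 records and the 15-minute time series 876,000, a hundred times more data per parameter.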

Recommended uses of Time Series and TMY datasets

Applications | Time Series | TMY
Running energy simulations | Yes | Yes
Calculation of absolute max / min values | Yes | No
Calculation of interannual variability | Yes | No
Comparison of data sources | Yes | No
Adjustment of values using ground data | Yes | No
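To see why interannual variability and absolute extremes need the full time series, consider the minimal sketch below. It works on annual GHI sums from several years, which a single-year TMY cannot supply. The function names and the sample values are made up for illustration.

```python
import statistics


def interannual_variability_pct(annual_sums):
    """Sample standard deviation of annual sums relative to their mean, in percent."""
    return 100.0 * statistics.stdev(annual_sums) / statistics.mean(annual_sums)


def absolute_extremes(values):
    """Lowest and highest value found anywhere in the series."""
    return min(values), max(values)


# Hypothetical annual GHI sums in kWh/m2 for three years.
annual_ghi = [1900, 2000, 2100]
variability = interannual_variability_pct(annual_ghi)
lowest, highest = absolute_extremes(annual_ghi)
```

`statistics.stdev` needs at least two values, which mirrors the point of the table: one typical year carries no information about year-to-year spread.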

Differences with pre-calculated monthly averages

Unlike Solargis Time Series or TMY data, which are retrieved in real time when an order is made, the pre-calculated averages database is updated once a year. As a result, the data used for each calculation can differ. In addition, to speed up the calculation, the pre-calculated averages go through a slightly different process.

In summary, the key differences between the two data products are based on the following factors:

  • Irradiance Model. Time Series and TMY use the latest model available at the time of order, while pre-calculated averages use the model available at the last database update.

  • Period of Years. Time Series and TMY use the most recent data available at the time of order, whereas pre-calculated averages use the period of data available at the last database update.

  • Calculation approach. Time Series and TMY data utilize bilinear interpolation during the reading process, with gap filling applied on night-time periods based on solar position.

As a result, annual averages of Global Horizontal Irradiance (GHI) calculated from Time Series or TMY data can differ from those in the pre-calculated averages database by up to ±1%.
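For readers unfamiliar with bilinear interpolation, the generic sketch below estimates a value at a point from the four surrounding grid cells and then expresses the gap between two annual averages in percent. It is a textbook formulation, not the Solargis reading routine, and the grid values are invented.

```python
def bilinear(x, y, x0, x1, y0, y1, q00, q10, q01, q11):
    """Interpolate at (x, y) inside the cell [x0, x1] x [y0, y1].

    q00 is the value at (x0, y0), q10 at (x1, y0), q01 at (x0, y1), q11 at (x1, y1).
    """
    tx = (x - x0) / (x1 - x0)
    ty = (y - y0) / (y1 - y0)
    bottom = q00 * (1 - tx) + q10 * tx   # interpolate along x at y0
    top = q01 * (1 - tx) + q11 * tx      # interpolate along x at y1
    return bottom * (1 - ty) + top * ty  # then interpolate along y


def relative_difference_pct(value, reference):
    """Difference of value from reference, as a percentage of reference."""
    return 100.0 * (value - reference) / reference


# Hypothetical annual GHI values (kWh/m2) at the four corners of a grid cell.
site_ghi = bilinear(0.5, 0.5, 0.0, 1.0, 0.0, 1.0, 1800.0, 1820.0, 1810.0, 1830.0)
difference = relative_difference_pct(1818.0, 1800.0)
```

A site at the cell center receives the average of the four corners, and a difference of 1.0 in the last line corresponds to the upper end of the ±1% range quoted above.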