
High-frequency irradiance synthesis functions #788

Open
kevinsa5 opened this issue Oct 14, 2019 · 19 comments
@kevinsa5
Contributor

kevinsa5 commented Oct 14, 2019

High-frequency PV simulations are useful in several contexts, including grid impact studies and energy storage simulations. The scarcity of high-frequency irradiance datasets has spurred the development of many methods for synthesizing high-frequency irradiance signals from lower-frequency measurements (e.g. hourly satellite data). A couple of examples:

"Sub-Hour Solar Data for Power System Modeling from Static Spatial Variability Analysis"
https://www.nrel.gov/docs/fy13osti/56204.pdf

"A stochastic downscaling approach for generating high-frequency solar irradiance scenarios"
http://amath.colorado.edu/faculty/kleiberw/papers/Zhang2018.pdf

These models often do not include a software implementation and are complex enough to present a significant barrier to entry for the reader. In that regard, they are similar to the decomposition/transposition irradiance functions included in pvlib. Implementing such a model in pvlib would increase its accessibility to the general public and increase pvlib's utility.

Would pvlib's authors be interested in including such a model in pvlib?

@mikofski
Member

Would Matthew Lave's wavelet variability model (wvm) also be applicable?

@wholmgren
Copy link
Member

Yes I support adding this kind of model to pvlib.

@kevinsa5
Contributor Author

kevinsa5 commented Oct 28, 2019

@mikofski Good point that spatial variability is relevant as well. I'm less familiar with the spatial correlation methods, but have the impression that wavelet models and cloud field models tend to be popular. I'm interested mostly in time variability so that's what I'll focus on here.

I've seen three approaches that I'd classify as "simple" and relatively easy to add to pvlib: Markov chain-based generators, lookup tables, and distribution sampling. The Markov-chain generator and lookup-table methods require a high-frequency timeseries input dataset to compute the Markov transition matrices (MTMs) or lookup tables, while other methods (in particular the distribution-sampling methods) are "already trained" in that they are parameterized by e.g. location and scale parameters (with suggested default values) and do not necessarily require a high-frequency irradiance training dataset. Given that pvlib has methods for retrieving high-frequency irradiance datasets (e.g. from SURFRAD), I'd say that requiring a high-frequency training dataset isn't a show-stopper for including those functions in pvlib. Thoughts?
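As a rough illustration of the Markov-chain approach: fit a transition matrix from a high-frequency clear-sky index series, then walk the chain to synthesize a new one. This is a minimal numpy sketch, not pvlib API; the state count, bin edges, and function names are all illustrative:

```python
import numpy as np

def fit_transition_matrix(kt, n_states=10):
    """Estimate a Markov transition matrix (MTM) from a high-frequency
    clear-sky index series (hypothetical helper, not pvlib API)."""
    # assign each kt sample to one of n_states equal-width bins on [0, 1.2]
    edges = np.linspace(0.0, 1.2, n_states + 1)
    states = np.clip(np.digitize(kt, edges) - 1, 0, n_states - 1)
    mtm = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        mtm[a, b] += 1
    # normalize rows to probabilities; never-visited states self-transition
    row_sums = mtm.sum(axis=1)
    empty = row_sums == 0
    mtm[empty, np.arange(n_states)[empty]] = 1.0
    mtm /= mtm.sum(axis=1, keepdims=True)
    return mtm, edges

def sample_chain(mtm, edges, n_steps, start_state=5, rng=None):
    """Generate a synthetic kt series by walking the chain.
    start_state must be < mtm.shape[0]."""
    rng = np.random.default_rng(rng)
    n_states = mtm.shape[0]
    states = np.empty(n_steps, dtype=int)
    states[0] = start_state
    for i in range(1, n_steps):
        states[i] = rng.choice(n_states, p=mtm[states[i - 1]])
    # map states back to bin-center kt values
    centers = (edges[:-1] + edges[1:]) / 2
    return centers[states]
```

The synthesized kt series would then be multiplied by a clear-sky or extraterrestrial GHI profile to recover irradiance.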

Also, any suggestions for how to write tests for these sorts of functions? The stochastic nature of these methods does not lend itself to the usual direct numeric comparisons. The papers tend to focus on characterizing the generated signals as a whole, for example by comparing their distributions against an expected curve with the two-sample Kolmogorov-Smirnov test, or by comparing autocorrelation against expectations, etc.
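One way to keep such a test deterministic is to fix the RNG seed and assert on the KS statistic. A sketch with scipy, where the beta-distributed samples are just stand-ins for a measured and a synthesized clear-sky index signal:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
# stand-ins for a measured and a synthesized clear-sky index sample
kt_measured = rng.beta(5, 2, size=2000)
kt_synthetic = rng.beta(5, 2, size=2000)

# two-sample Kolmogorov-Smirnov test: a small statistic / large p-value
# means the synthetic distribution is consistent with the measured one;
# with a fixed seed the comparison is reproducible in a test suite
stat, pvalue = ks_2samp(kt_measured, kt_synthetic)
```

The same seeded pattern would work for autocorrelation comparisons or variability-index checks.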

Here are some additional references:

An N-state Markov-chain mixture distribution model of the clear-sky index
https://www.sciencedirect.com/science/article/pii/S0038092X18307205#bb0110

And a spatial extension of the above:
A spatiotemporal Markov-chain mixture distribution model of the clear-sky index
https://www.sciencedirect.com/science/article/pii/S0038092X18312611#bb0150

A simple and efficient procedure for increasing the temporal resolution of global horizontal solar irradiance series
https://www.sciencedirect.com/science/article/pii/S0960148115302044#bib23

Improved Synthesis of Global Irradiance with One-Minute Resolution for PV System Simulations
https://www.hindawi.com/journals/ijp/2014/808509/

Stochastic Downscaling Algorithm to Generate High-Resolution Time-Series for Improved PV Yield Simulations
https://suntrace.de/fileadmin/user_upload/Duscha_C._A.__Buehler_S.A.__Lezaca_J._Bohny_C.__Meyer_R._Stochastic_Downscaling_Algorithm_to_generate_high_resolution_time-series_for_improved_PV_yield_simulations_2016_PVSEC_2016_.pdf

@cwhanse
Member

cwhanse commented Oct 28, 2019

My colleague Matt Lave spent some time looking at various downscaling algorithms/datasets to create input for electrical distribution system simulations. His conclusion is that statistical downscaling methods tend to underestimate variability when compared to high-frequency measurements. The root cause is likely that the algorithms mix too rapidly (or have too short a memory), because otherwise they become computationally impractical. The methods that process windows of satellite data (the HRIA, for example) also suffered from this tendency.

I'm not against creating a library of such algorithms, but perhaps it would be better to set up a separate project than to build a few of the algorithms into pvlib. Downscaling irradiance is very much a topic of research with no convergence toward a few "good" answers.

@wholmgren
Member

I'm not against creating a library of such algorithms, but perhaps it would be better to set up a separate project than to build a few of the algorithms into pvlib. Downscaling irradiance is very much a topic of research with no convergence toward a few "good" answers.

In general I don't see why these algorithms are any less appropriate than the myriad transposition or airmass models, nor are they too domain-specific. Reference implementations could be beneficial for further algorithm development.

What I don't want to see:

  1. trivial wrappers around functions from packages like sklearn or statsmodels that don't add significant value to the PV modeler.
  2. implementations of statistical models that are already in other packages

I skimmed a few of those references and came away thinking that most of the potential pvlib functions would fall into 1 or 2. @kevinsa5 can you get more specific about what you think belongs in the pvlib modules and what might be better addressed through, say, documentation examples?

@kevinsa5
Contributor Author

I see Cliff's point that the current state of this research area is fairly disorganized and likely to evolve in the future. I'd argue that, while these algorithms are flawed, an accessible implementation of one or more of them would still be useful in many contexts. As a point of comparison, bifacial modeling is in a similar situation.

What I'm envisioning is some number of low-level downscaling functions and a high-level function that would fit into pvlib's irradiance modeling chain, e.g. fetch GHI -> downscale -> decomposition -> transposition. Some model implementations might also require training functions to generate MTMs, lookup tables, or other model dependencies.

@wholmgren I'm surprised that you got the impression that some of them could be implemented as trivial wrappers around the pydata stack. For instance, the Fernández-Peruchena/Gastón 2016 method described in "A simple and efficient procedure for increasing the temporal resolution of global horizontal solar irradiance series" goes something like:

# create library of high-res kt signals from a high-res measured ghi signal
def generate_library(ghi_meas, ghi_et):
    # calculate clearness index
    # partition by day
    # normalize time axis of each day's kt signal
    # return normalized kt library

# use library to generate a high-res signal from a low-res signal
def synthesize_variability(ghi, library):
    # for each day:
    #     apply kt library to that day's ghi_et
    #     choose the kt*ghi_et day that most closely recreates the measured ghi signal at native resolution

If the models were as simple as ghi_out = ghi_in * np.random.beta(1, 10, len(ghi_in)) then I'd say a quick documentation example would be fine, but it seems like these are complex enough that a full function would be the better choice.
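A runnable version of that sketch, simplified so it stays short (numpy only, equal-length days, and daily-mean matching instead of matching at the native low resolution; the names are illustrative, not pvlib API):

```python
import numpy as np

def generate_library(ghi_meas, ghi_et, samples_per_day):
    """Build a library of high-resolution clearness-index day profiles
    from a measured high-resolution GHI series. Simplified sketch:
    assumes the series holds complete days of equal length."""
    # clearness index kt = GHI / extraterrestrial GHI (0 where ET is 0)
    kt = np.divide(ghi_meas, ghi_et,
                   out=np.zeros_like(ghi_meas, dtype=float),
                   where=ghi_et > 0)
    return kt.reshape(-1, samples_per_day)  # one kt profile per day

def synthesize_variability(ghi_daily_mean, ghi_et_day, library):
    """Synthesize one high-resolution day: apply every library kt
    profile to the day's extraterrestrial irradiance and keep the
    candidate whose mean best reproduces the low-resolution (here:
    daily-mean) measurement."""
    candidates = library * ghi_et_day          # (n_days, samples_per_day)
    errors = np.abs(candidates.mean(axis=1) - ghi_daily_mean)
    return candidates[np.argmin(errors)]
```

In a real implementation the matching would happen at the input's native resolution (e.g. hourly blocks rather than daily means), and ghi_et would come from a solar-position/extraterrestrial-irradiance calculation.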

@mikofski
Member

@kevinsa5 sorry to diverge, but just as valuable to me as high-frequency subhourly data would be synthesizing hourly TMY data from monthly data, which is more or less the same problem. But the problem I found is in getting a distribution of kt relevant for the site of interest; if I had that, then why would I be downscaling monthly data? You mentioned earlier that there are some models that

are "already trained" in that they are parameterized by e.g. location and scale parameters (with suggested default values) and do not necessarily require a high-frequency irradiance training dataset.

If pvlib has access to these parameters and could synthesize TMY data for any site globally, that would be awesome!

@jranalli
Contributor

I needed to port the discrete point cloud case of the Matlab WVM model to Python and have a working implementation for that case. This essentially reduces variability rather than increasing it, and it's also sort of an auxiliary package to the Matlab library, but I saw the model mentioned above.

Would completing this port for the other cases in the Matlab library (a relatively easy task) be a desirable contribution here?

@cwhanse
Member

cwhanse commented Oct 30, 2019

@jar339 I support porting the WVM to python. Question is where.

@wholmgren Put it in irradiance.py? That module is nearly 3000 lines and could already use a refactor. Open to ideas.

@wholmgren
Member

I support porting the WVM to python. Question is where.

I agree, but I don't have an opinion on where to put it at this time. Maybe irradiance.py would be ok if some of the other stuff was moved elsewhere. Maybe put it in its own module.

I'm surprised that you got the impression that some of them could be implemented as trivial wrappers around the pydata stack.

@kevinsa5 you can attribute that to my lack of familiarity with the models and my lack of time to review the papers. Thanks for the concrete example. That helps a lot.

@jranalli
Contributor

I support porting the WVM to python. Question is where.

I agree, but I don't have an opinion on where to put it at this time. Maybe irradiance.py would be ok if some of the other stuff was moved elsewhere. Maybe put it in its own module.

For what it's worth, there's very little in the model that would be similar to the irradiance module. It's mostly computing a distribution representing the plant, and then the wavelet transform. Personally, I think it makes the most sense in a framework of general variability/frequency, or else spatial analysis. Maybe if one of those is likely to get future contributions, it might be reasonable.

Should I spin this off to a separate issue, since it might be different (and more compartmented) than the broader downscaling discussion?

@cwhanse
Member

cwhanse commented Oct 31, 2019

Should I spin this off to a separate issue, since it might be different (and more compartmented) than the broader downscaling discussion?

Yes. Let's start a new module with this submission, scaling.py comes to mind, but I'm not enamored of it. Scope will be functions that operate on irradiance, perhaps other variables, to transform temporal or spatial characteristics.

@mikofski
Member

ICYMI @williamhobbs didn't you write a paper in which you observed different magnitude hourly modeling errors between synthetic sub-hourly data and real data?

@AdamRJensen
Member

AdamRJensen commented Oct 30, 2024

Miguel Caminero from the University of Seville has code for the downscaling method described in the paper below and is willing to share it. The paper takes a different approach from that of A.P. Grantham; Miguel's paper compares both approaches.

"Methodology to synthetically downscale DNI time series from 1-h to 1-min temporal resolution with geographic flexibility." Solar Energy, 2018, Vol. 162, pp. 573-584. https://doi.org/10.1016/j.solener.2018.01.064

@jranalli
Contributor

jranalli commented Oct 30, 2024

I am in the process of implementing Matt Lave's cloud-field based downscaling in SolarSpatialTools. I have a working draft in a development branch, but am finalizing a demo. Happy to discuss whether that would be of interest to contribute to pvlib instead, but I was going to finish the draft implementation first.

M. Lave, M. J. Reno and R. J. Broderick, "Creation and Value of Synthetic High-Frequency Solar Inputs for Distribution System QSTS Simulations," 2017 IEEE 44th Photovoltaic Specialist Conference (PVSC), Washington, DC, USA, 2017, pp. 3031-3033, doi: 10.1109/PVSC.2017.8366378.

I was hoping to look at Widén's copula-based downscaling next, but that's potentially further out depending on my available time.

J. Widén and J. Munkhammar, "Spatio-Temporal Downscaling of Hourly Solar Irradiance Data Using Gaussian Copulas," 2019 IEEE 46th Photovoltaic Specialists Conference (PVSC), Chicago, IL, USA, 2019, pp. 3172-3178, doi: 10.1109/PVSC40753.2019.8980922.

@Miguellarraneta

Miguel Larrañeta here. I did my PhD thesis on downscaling solar data (from hourly to 1-min), mainly focused on CSP systems, so DNI, but also GHI. I tested several methods and came up with the ND-model as the best solution. I'm open to providing the scripts and discussing possible collaborations on the topic. I expect to work on AI algorithms for downscaling in the next year.

@williamhobbs
Contributor

ICYMI @williamhobbs didn't you write a paper in which you observed different magnitude hourly modeling errors between synthetic sub-hourly data and real data?

Yes, with Chloe Black, @wholmgren, and @kandersolar. https://doi.org/10.1109/pvsc48320.2023.10359541 (pre-print: https://dx.doi.org/10.36227/techrxiv.23837340.v2). To keep the commercial datasets anonymous, we unfortunately couldn't say which results came from which adjustment methods.

One conclusion was that different sources/methods give notably different results when it comes to representing subhourly inverter clipping losses. I'd consider adding some sort of accuracy "disclaimer" to any downscaling method implemented in pvlib.
