-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add functions to read and retrieve SolarAnywhere irradiance data #1497
Conversation
@KWagnerCPR It would be great if you could test the functions and give some feedback on the default values. 😄 |
@AdamRJensen Yes, I'll have a few people on our team test it out and get back to you. Which default values are you referring to? |
@KWagnerCPR In regards to default values I'm thinking of these:
Also, I have chosen not to parse any annual and monthly data, as that seems to be outside the scope of pvlib and would just unnecessarily complicate the function. Furthermore, the |
In terms of default values, I would recommend the following: With SolarAnywhere Public, users get access to 1 km (0.01 x 0.01 degree) spatial resolution data, so I think it makes sense to set the default as 0.01 if our focus is on time series requests rather than TGY requests. Also, using the "FillAverage" default for missing data will ensure periods of missing satellite imagery are filled with long term averages for those times, which is likely preferable. You should be able to request PXX data as a part of the SolarAnywhere public license. The issue might have been that you had the spatial resolution set to 0.1 when you requested it. You also have to include "ProbabilityOfExceedance": 90, in the request. I think it makes sense not to request the monthly and annual totals at this time. |
@KWagnerCPR I have updated the default values accordingly. Thanks for the feedback! 😄 Also, I figured out how to work with PXX data, so I've made some modifications to handle that and added a keyword called |
@pvlib/pvlib-maintainer I am thinking of developing primarily mock tests for the |
@AdamRJensen We've tested these functions and they are working well! One thing I noticed is that when I request time series data (WeatherDataSource = SolarAnywhereLatest) without specifying a start or end time, I don't get the error message you specify in the script: "'When requesting non-TMY data, specifying I think these will be really helpful for our customers as well as new SolarAnywhere users. I'd love to include links to these in our developers documentation. Do you know if these will get documented in the official pvlib documentation?: https://pvlib-python.readthedocs.io/en/stable/index.html |
@KWagnerCPR Thank you for catching this error. I had forgotten to raise the error, hence nothing was happening - it was been corrected.
Once the functions get merged and we make a new release of pvlib, the documentation will be available publicly. We're releasing version 0.9.2 within a few weeks, but I can't promise that this pull request will be merged by then. |
@AdamRJensen Thank you! I'll stay tuned. |
When downloading data from the SolarAnywhere website, missing data may either be filled with average data or represented as missing using one of three options: Missing values are by default correctly converted to Numpy nan. Similarly, I belive that the 'NaN' option would also be correctly converted to Numpy nan since we're using pd.read_csv (though I haven't tested it). However, -999 will have to be converted to numpy nan by the @KWagnerCPR Do you have a file or a know station/time period where there is missing data that I could use for testing? |
I think the problem is this:
What to do about it? Modify the git diff upstream/main -- "*.py" | flake8 --exclude pvlib/version.py --ignore E201,E241,E226,W503,W504 --max-line-length 79 --diff |
This solved the issue. Thanks! |
Co-authored-by: Kevin Anderson <kevin.anderso@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor comments, but this looks good to me! Note that I did not actually run any code.
pvlib/iotools/solaranywhere.py
Outdated
data.index = pd.to_datetime(data['ObservationTime(GMT)'], | ||
format='%m/%d/%Y %H:%M', utc=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
data.index = pd.to_datetime(data['ObservationTime(GMT)'], | |
format='%m/%d/%Y %H:%M', utc=True) | |
data.index = pd.to_datetime(data['ObservationTime(LST)'], | |
format='%m/%d/%Y %H:%M') |
Maybe better to use the local standard time column?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If so, I think the tests will need to be adjusted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are right that LST is preferred. My main point for saying this is that it's much easier for a user to convert to UTC than to convert to LST,
solaranywhere_api_key = os.environ["SOLARANYWHERE_API_KEY"] | ||
has_solaranywhere_credentials = True | ||
except KeyError: | ||
has_solaranywhere_credentials = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why codecov is claiming that line 102 isn't covered. The CI logs show that the tests are running, so I don't see why this line wouldn't be getting hit. No action needed; just recording my confusion...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One comment about requirements start
and end
, rest are spelling-level nits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert this, or is it fixing a mistake?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no change there so I can't remove the file... I think it's fine leaving it. @kandersolar leave a thumb if you have an opinion
pvlib/iotools/solaranywhere.py
Outdated
payload['Options']['ApplyTrueDynamics'] = True | ||
|
||
if probability_of_exceedance is not None: | ||
if type(probability_of_exceedance) != int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isinstance
here? I don't know if type
== is preferrable.
I would type(probability_of_exceedance) == int
: and let all the else's raise the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched to isinstance
probability_of_exceedance | ||
|
||
# Add start/end time if requesting non-TMY data | ||
if (start is not None) or (end is not None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should or
be and
? I'm guessing both start
and end
are required. If so, then I think we could use an else: raise ValueError
here, unless SolarAnywhere supplies some default times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a user only specified start
or end
we would want them to get useful feedback, such as None
cannot be converted to timestamp which is what would happen now. If you use and
then the error message would first happen much later in the code and be more difficult to interpret.
Co-authored-by: Cliff Hansen <cwhanse@sandia.gov>
Ok, going ahead with the merge. Thanks @AdamRJensen! |
We greatly appreciate the work on this! |
docs/sphinx/source/reference
for API changes.docs/sphinx/source/whatsnew
for all changes. Includes link to the GitHub Issue with:issue:`num`
or this Pull Request with:pull:`num`
. Includes contributor name and/or GitHub username (link with:ghuser:`user`
).remote-data
) and Milestone are assigned to the Pull Request and linked Issue.Description
This PR adds functions for reading and retrieving SolarAnywhere data as discussed in #1310. SolarAnywhere is a commercial dataset, although it provides irradiance data for a few locations for free.
Minor comments
The SolarAnywhere API is asynchronous, hence two requests are made for each data retrieval.
The
get_solaranywhere
function is currently not designed to handle annual and monthly data, as this seems less relevant to pvlib users.