use Cmr-search-after instead of paging #87

Merged: 51 commits, Feb 9, 2022
Changes from 13 commits
94acb79
Let travis test everything in icepyx
weiji14 Jun 4, 2020
74dd0e3
Fix test_granule_info
weiji14 Jun 4, 2020
929594e
Fix test_correct_granule_list_returned
weiji14 Jun 8, 2020
8026be1
Reformat CONTRIBUTORS.rst, add myself, fix typo
dshean Jun 16, 2020
58ed1ea
Test everything except behind_NSIDC_API_login.py
weiji14 Jun 16, 2020
16f83e8
add pr builds to travis
JessicaS11 Jun 16, 2020
8da8239
Merge branch 'development' into development
JessicaS11 Jun 16, 2020
f4aee1d
Merge pull request #76 from dshean/development
JessicaS11 Jun 16, 2020
3319680
Add self to contributors
Jun 18, 2020
3760d43
Replace paging with scrolling in granule cmr query
Jun 18, 2020
a9bbdd2
Update place_order method paging
Jun 18, 2020
e49e1b8
Cleanup in granules
Jun 18, 2020
8925f75
Fix errors in tests
Jun 18, 2020
5e4187a
Rename and update tests
Jun 18, 2020
a951f6f
Remove datetime from requirements (builtin)
Jun 18, 2020
fcbe215
codeblock defined as none is not compatible with pypi markup
loudTom Jun 18, 2020
823cf34
Added T Johnson to list of contributors
loudTom Jun 18, 2020
aefbf6f
Add codecov
Jun 18, 2020
e948fa6
Updated Contributers, added my name
annavalentine Jun 18, 2020
9a7eab6
Add dev requirements and specify travis install
Jun 18, 2020
f6c064f
Merge branch 'development' into cmr-scrolling-search
wallinb Jun 18, 2020
f66388e
Update CONTRIBUTORS.rst
wallinb Jun 18, 2020
b71d574
Fix codecov badge links (use icesat2py repo)
Jun 18, 2020
35444ec
Updated installation instructions
loudTom Jun 18, 2020
96bab86
Changed installation instructions on readthedocs
loudTom Jun 18, 2020
ca0b4fe
Merge pull request #84 from icesat2py/anna_dev2
JessicaS11 Jun 18, 2020
7cd1795
Merge branch 'development' into codecov
JessicaS11 Jun 18, 2020
7bf566c
Merge pull request #88 from wallinb/codecov
JessicaS11 Jun 18, 2020
9f25d59
Merge branch 'development' into tj_dev
JessicaS11 Jun 18, 2020
67619f8
Merge pull request #85 from icesat2py/tj_dev
JessicaS11 Jun 18, 2020
9809159
Update codecov badge to point to 'development' branch
wallinb Jun 18, 2020
8c89907
Merge branch 'development' into cmr-scrolling-search
wallinb Jun 23, 2020
709cbd0
Merge branch 'development' into tj_docs
JessicaS11 Jun 24, 2020
62ce481
Merge pull request #89 from icesat2py/tj_docs
JessicaS11 Jun 24, 2020
16466f8
Merge branch 'development' into readme-fix
JessicaS11 Jun 24, 2020
98e68fa
Merge pull request #90 from icesat2py/readme-fix
JessicaS11 Jun 24, 2020
9f5e8a4
Merge branch 'development' into travis_test_all
JessicaS11 Jun 26, 2020
82f4aec
Merge pull request #61 from icesat2py/travis_test_all
JessicaS11 Jun 26, 2020
af4fc4d
Merge branch 'development' into cmr-scrolling-search
JessicaS11 Jun 26, 2020
79cc421
Merge branch 'development' into cmr-scrolling-search
JessicaS11 Feb 2, 2022
b37be51
fix missed merge conflicts
JessicaS11 Feb 2, 2022
5119cdb
manual updates from commit history
JessicaS11 Feb 2, 2022
009755f
fix missed merge conflicts
JessicaS11 Feb 2, 2022
da465bd
turn 'scroll' into a reqparam for CMR searches
JessicaS11 Feb 2, 2022
a68d71f
switch from scrolling to CMR-Search-After
JessicaS11 Feb 2, 2022
e9a7d2a
remove missed scroll kwargs
JessicaS11 Feb 2, 2022
a67848b
remove changes to example I had checked out but committed anyway
JessicaS11 Feb 2, 2022
4c8e379
add ability to only order one page by specifying 'page-num'
JessicaS11 Feb 3, 2022
02481ae
Merge branch 'development' into cmr-scrolling-search
JessicaS11 Feb 4, 2022
fe9ca49
GitHub action UML generation auto-update
betolink Feb 8, 2022
fa8bb68
undo unintentional example commit
JessicaS11 Feb 3, 2022
2 changes: 1 addition & 1 deletion .travis.yml
@@ -2,7 +2,7 @@ language: python

stages:
- name: basic tests
if: type = push
if: type = push OR type = pull_request
- name: behind Earthdata
if: branch = master OR commit_message =~ nsidc_tests OR type = cron

24 changes: 13 additions & 11 deletions CONTRIBUTORS.rst
@@ -3,15 +3,17 @@ Project Contributors

The following people have made contributions to the project (in alphabetical
order by last name) and are considered "The icepyx Developers":
* [Anthony Arendt](https://github.com/aaarendt/) - University of Washington
* [Shashank Bhushan](https://github.com/ShashankBice) - University of Washington
* [Raphael Hagen](https://github.com/norlandrhagen) - University of Washington
* [Scott Henderson](https://github.com/scottyhq) - University of Washington
* [Zheng Liu](https://github.com/liuzheng-arctic) - University of Washington
* [Joakim Meyer](https://github.com/jomey) - University of Utah
* [Fernando Perez](https://github.com/fperez) - University of California, Berkeley
* [Jessica Scheick](https://github.com/jessicas11) - Unaffiliated (ORCID: [0000-0002-3421-4459](https://www.orcid.org/0000-0002-3421-4459))
* [Ben Smith](https://github.com/smithb) - University of Washington
* [Amy Steiker](https://github.com/asteiker) - NSIDC, University of Colorado
* [Bidhyananda Yadav](https://github.com/bidhya) - Ohio State University

* `Anthony Arendt <https://github.com/aaarendt/>`_ - University of Washington
* `Shashank Bhushan <https://github.com/ShashankBice>`_ - University of Washington
* `Raphael Hagen <https://github.com/norlandrhagen>`_ - University of Washington
* `Scott Henderson <https://github.com/scottyhq>`_ - University of Washington
* `Zheng Liu <https://github.com/liuzheng-arctic>`_ - University of Washington
* `Joachim Meyer <https://github.com/jomey>`_ - University of Utah
* `Fernando Perez <https://github.com/fperez>`_ - University of California, Berkeley
* `Jessica Scheick <https://github.com/jessicas11>`_ - Unaffiliated (ORCID: `0000-0002-3421-4459 <https://www.orcid.org/0000-0002-3421-4459>`_)
* `David Shean <https://github.com/dshean>`_ - University of Washington
* `Ben Smith <https://github.com/smithb>`_ - University of Washington
* `Amy Steiker <https://github.com/asteiker>`_ - NSIDC, University of Colorado
* `Bidhyananda Yadav <https://github.com/bidhya>`_ - Ohio State University
* _Bruce Wallin <https://github.com/wallinb> - University of Colorado - NSIDC
81 changes: 44 additions & 37 deletions icepyx/core/granules.py
@@ -89,45 +89,55 @@ def get_avail(self, CMRparams, reqparams):
-----
This function is used by icesat2data.Icesat2Data.avail_granules(), which automatically
feeds in the required parameters.

See Also
--------
APIformatting.Parameters
icesat2data.Icesat2Data.avail_granules
"""

assert CMRparams is not None and reqparams is not None, "Missing required input parameter dictionaries"
assert (CMRparams is not None and reqparams is not None), "Missing required input parameter dictionaries"

# if not hasattr(self, 'avail'):
self.avail=[]
# if not hasattr(self, 'avail'):
self.avail = []

granule_search_url = 'https://cmr.earthdata.nasa.gov/search/granules'

headers={'Accept': 'application/json'}
headers = {'Accept': 'application/json'}
#DevGoal: check the below request/response for errors and show them if they're there; then gather the results
#note we should also do this whenever we ping NSIDC-API - make a function to check for errors
params = apifmt.combine_params(
CMRparams, {k: reqparams[k] for k in ['page_size']}
)
params['scroll'] = 'true'

cmr_scroll_id = None
while True:
response = requests.get(granule_search_url, headers=headers,\
params=apifmt.combine_params(CMRparams,\
{k: reqparams[k] for k in ('page_size','page_num')}))
if cmr_scroll_id is not None:
headers['CMR-Scroll-Id'] = cmr_scroll_id

results = json.loads(response.content)
response = requests.get(granule_search_url, headers=headers, params=params)

# print(results)

if len(results['feed']['entry']) == 0:
# Out of results, so break out of loop
if cmr_scroll_id is None:
hits = int(response.headers['CMR-Hits'])

cmr_scroll_id = response.headers['CMR-Scroll-Id']

results = json.loads(response.content)
granules = results['feed']['entry']
if not granules:
# Done scrolling
assert (
len(self.avail) == hits
), 'Search failure - unexpected number of results'
break

# Collect results and increment page_num
self.avail.extend(results['feed']['entry'])
reqparams['page_num'] += 1
self.avail.extend(granules)

#DevNote: The above calculated page_num is wrong when mod(granule number, page_size)=0.
# print(reqparams['page_num'])
reqparams['page_num'] = int(np.ceil(len(self.avail)/reqparams['page_size']))

assert len(self.avail)>0, "Your search returned no results; try different search parameters"
assert (
len(self.avail) > 0
), "Your search returned no results; try different search parameters"


#DevNote: currently, default subsetting DOES NOT include variable subsetting, only spatial and temporal
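At this commit the loop above still paginates with CMR's scroll mechanism (`CMR-Scroll-Id`); per the commit list, a68d71f later switches it to the `CMR-Search-After` header. Either way the pattern is the same: read a pagination token from the response headers and echo it back on the next request until an empty page comes back, then check the collected count against `CMR-Hits`. A runnable sketch with a stubbed fetch (`fake_cmr_get` stands in for `requests.get` against the CMR granule endpoint; the integer token is a simplification of CMR's opaque tokens):

```python
import json

def fake_cmr_get(params, headers):
    # Illustrative stand-in for requests.get: returns (headers, body) and
    # hands back a CMR-Search-After token until the result set is exhausted.
    all_entries = [{"producer_granule_id": f"ATL06_{i:03d}.h5"} for i in range(25)]
    page_size = int(params["page_size"])
    start = int(headers.get("CMR-Search-After", 0))
    page = all_entries[start : start + page_size]
    resp_headers = {"CMR-Hits": str(len(all_entries))}
    if page:
        resp_headers["CMR-Search-After"] = str(start + len(page))
    return resp_headers, json.dumps({"feed": {"entry": page}})

def get_all_granules(page_size=10):
    avail = []
    headers = {"Accept": "application/json"}
    while True:
        resp_headers, body = fake_cmr_get({"page_size": page_size}, headers)
        granules = json.loads(body)["feed"]["entry"]
        if not granules:
            # Out of results: sanity-check against the server-reported hit count.
            assert len(avail) == int(resp_headers["CMR-Hits"])
            break
        avail.extend(granules)
        # Echo the pagination token back on the next request.
        headers["CMR-Search-After"] = resp_headers["CMR-Search-After"]
    return avail

granules = get_all_granules()
print(len(granules))  # 25
```

Because the token lives in headers rather than in a `page_num` query parameter, the client never has to compute how many pages exist, which is what makes the old page-counting logic removable.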
@@ -182,29 +192,26 @@ def place_order(self, CMRparams, reqparams, subsetparams, verbose,
base_url = 'https://n5eil02u.ecs.nsidc.org/egi/request'
#DevGoal: get the base_url from the granules?


self.get_avail(CMRparams, reqparams) #this way the reqparams['page_num'] is updated
self.get_avail(CMRparams, reqparams)

if subset is False:
request_params = apifmt.combine_params(CMRparams, reqparams, {'agent':'NO'})
request_params = apifmt.combine_params(CMRparams, reqparams, {'agent': 'NO'})
else:
request_params = apifmt.combine_params(CMRparams, reqparams, subsetparams)

order_fn = '.order_restart'

print('Total number of data order requests is ',request_params['page_num'], ' for ',len(self.avail), ' granules.')
#DevNote/05/27/20/: Their page_num values are the same, but use the combined version anyway.
#I'm switching back to reqparams, because that value is not changed by the for loop. I shouldn't cause an issue either way, but I've had issues with mutable types in for loops elsewhere.
for i in range(reqparams['page_num']):
# for i in range(request_params['page_num']):
page_val = i + 1

print('Data request ', page_val, ' of ', reqparams['page_num'],' is submitting to NSIDC')
request_params.update( {'page_num': page_val} )

#DevNote: earlier versions of the code used a file upload+post rather than putting the geometries
#into the parameter dictionaries. However, this wasn't working with shapefiles, but this more general
#solution does, so the geospatial parameters are included in the parameter dictionaries.
total_pages = int(np.ceil(len(self.avail) / reqparams['page_size']))
print('Total number of data order requests is ',total_pages, ' for ',len(self.avail), ' granules.')
for page_num in range(1, total_pages+1):

print('Data request ',page_num,' of ',total_pages,' is submitting to NSIDC',)
request_params = apifmt.combine_params(CMRparams, reqparams, subsetparams)
request_params['page_num'] = page_num

# DevNote: earlier versions of the code used a file upload+post rather than putting the geometries
# into the parameter dictionaries. However, this wasn't working with shapefiles, but this more general
# solution does, so the geospatial parameters are included in the parameter dictionaries.
request = session.get(base_url, params=request_params)

#DevGoal: use the request response/number to do some error handling/give the user better messaging for failures
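The rewritten page count in `place_order` is a ceiling division over the collected granules (the diff uses `np.ceil`; `math.ceil` below is equivalent for this purpose). It sidesteps the DevNote bug where the incrementally computed `page_num` came out wrong when the granule count was an exact multiple of `page_size`:

```python
import math

def total_pages(n_granules, page_size):
    # Ceiling division: the number of order requests needed to cover all granules.
    return math.ceil(n_granules / page_size)

print(total_pages(25, 10))  # 3 -> pages 1, 2, 3
# Exact multiple: 20 granules at page_size 10 needs exactly 2 pages,
# the case the old incrementing logic could miscount.
print(total_pages(20, 10))  # 2
print(total_pages(0, 10))   # 0
```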
@@ -401,4 +408,4 @@ def download(self, verbose, path, session=None, restart=False):
if os.path.exists( downid_fn ): os.remove(downid_fn)

print('Download complete')


12 changes: 6 additions & 6 deletions icepyx/tests/test_granules.py
@@ -22,7 +22,7 @@
def test_granules_info():
# reg_a = ipd.Icesat2Data('ATL06', [-55, 68, -48, 71], ['2019-02-20','2019-02-24'], version='3')
# granules = reg_a.granules.avail
granules = [{'producer_granule_id': 'ATL06_20190221121851_08410203_003_01.h5',
entries = [{'producer_granule_id': 'ATL06_20190221121851_08410203_003_01.h5',
'time_start': '2019-02-21T12:19:05.000Z',
'orbit': {'ascending_crossing': '-40.35812957405553',
'start_lat': '59.5',
@@ -386,10 +386,10 @@ def test_granules_info():
'rel': 'http://esipfed.org/ns/fedsearch/1.1/documentation#',
'hreflang': 'en-US',
'href': 'https://doi.org/10.5067/ATLAS/ATL06.003'}]}]
obs = granules.info(granules)
obs = granules.info(entries)

exp = {'Number of available granules': 2, /
'Average size of granules (MB)': 46.49339485165, /
exp = {'Number of available granules': 2,
'Average size of granules (MB)': 46.49339485165,
'Total size of all granules (MB)': 92.9867897033}

assert obs==exp
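The expected summary in this test is just a count, a mean, and a sum over the granules' `granule_size` fields. A minimal illustration of that arithmetic (the `summarize` helper is illustrative, not the actual `granules.info` implementation, and the two sizes below are chosen to reproduce the test's expected totals; the real fixture's sizes are elided from this excerpt):

```python
def summarize(entries):
    # Count, mean, and sum of granule sizes, mirroring the keys the test expects.
    sizes = [float(g["granule_size"]) for g in entries]
    return {
        "Number of available granules": len(sizes),
        "Average size of granules (MB)": sum(sizes) / len(sizes),
        "Total size of all granules (MB)": sum(sizes),
    }

entries = [{"granule_size": "40.9867897033"}, {"granule_size": "52.0"}]
info = summarize(entries)
print(info["Number of available granules"])  # 2
```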
@@ -408,7 +408,7 @@
def test_correct_granule_list_returned():
reg_a = ipd.Icesat2Data('ATL06',[-55, 68, -48, 71],['2019-02-20','2019-02-28'], version='2')
reg_a.avail_granules()
obs_grans = [gran['producer_granule_id'] for gran in reg_a.granules]
obs_grans = [gran['producer_granule_id'] for gran in reg_a.granules.avail]
exp_grans = ['ATL06_20190221121851_08410203_002_01.h5', 'ATL06_20190222010344_08490205_002_01.h5', 'ATL06_20190225121032_09020203_002_01.h5', 'ATL06_20190226005526_09100205_002_01.h5']

assert set(obs_grans) == set(exp_grans)
assert set(obs_grans) == set(exp_grans)
@@ -1,10 +1,9 @@
from icepyx import is2class as ipd
import icepyx.core.icesat2data as ipd
import pytest
import warnings

def test_CMRparams():
reg_a = ipd.Icesat2Data('ATL06',[-64, 66, -55, 72],['2019-02-22','2019-02-28'])
reg_a.build_CMR_params()
obs_keys = reg_a.CMRparams.keys()
exp_keys_all = ['short_name','version','temporal']
exp_keys_any = ['bounding_box','polygon']
@@ -16,17 +15,14 @@ def test_reqconfig_params():
reg_a = ipd.Icesat2Data('ATL06',[-64, 66, -55, 72],['2019-02-22','2019-02-28'])

#test for search params
reg_a.build_reqconfig_params('search')
obs_keys = reg_a.reqparams.keys()
exp_keys_all = ['page_size','page_num']
assert all(keys in obs_keys for keys in exp_keys_all)

#test for download params
reg_a.reqparams=None
reg_a.build_reqconfig_params('download')
reg_a.reqparams.update({'token':'','email':''})
obs_keys = reg_a.reqparams.keys()
exp_keys_all = ['page_size','page_num','request_mode','token','email','include_meta']
exp_keys_all = ['page_size','page_num','token','email']
assert all(keys in obs_keys for keys in exp_keys_all)

def test_properties():
@@ -44,4 +40,4 @@ def test_properties():



#check that search results are correct (spatially, temporally, match actually available data)
#check that search results are correct (spatially, temporally, match actually available data)
1 change: 0 additions & 1 deletion requirements.txt
@@ -1,4 +1,3 @@
datetime
fiona
geopandas
h5py
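Dropping `datetime` from requirements.txt is correct because it ships with the Python standard library and is not a PyPI dependency; no install step is needed for it to be importable:

```python
# datetime is part of CPython's standard library, so it does not belong
# in requirements.txt alongside third-party packages like fiona or h5py.
import datetime

d = datetime.date(2022, 2, 9)  # the PR's merge date
print(d.isoformat())  # 2022-02-09
```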