refactor CMAQ ingest to be more efficient / faster #20

Open
mjstealey opened this issue Jan 16, 2018 · 0 comments

  • The existing implementation works well for small data sets, but takes weeks (or longer) to ingest data on the 299 x 459 grid

Example:

Processes started to ingest the 2011 CMAQ data (PID, start time, command; the file paths are truncated in this listing):

25124 Sat Dec 23 20:24:15 2017 ./venv/bin/python ./ingest-cmaq-file.py /projects/datatrans/CMAQ/2011/raw/CCTM_CMAQ_v51_Release_Oct23_NoDust_
25268 Sat Dec 23 20:40:27 2017 ./venv/bin/python ./ingest-cmaq-file.py /projects/datatrans/CMAQ/2011/raw/CCTM_CMAQ_v51_Release_Oct23_NoDust_
25269 Sat Dec 23 20:40:27 2017 ./venv/bin/python ./ingest-cmaq-file.py /projects/datatrans/CMAQ/2011/raw/CCTM_CMAQ_v51_Release_Oct23_NoDust_
25270 Sat Dec 23 20:40:27 2017 ./venv/bin/python ./ingest-cmaq-file.py /projects/datatrans/CMAQ/2011/raw/CCTM_CMAQ_v51_Release_Oct23_NoDust_
25271 Sat Dec 23 20:40:27 2017 ./venv/bin/python ./ingest-cmaq-file.py /projects/datatrans/CMAQ/2011/raw/CCTM_CMAQ_v51_Release_Oct23_NoDust_
25272 Sat Dec 23 20:40:27 2017 ./venv/bin/python ./ingest-cmaq-file.py /projects/datatrans/CMAQ/2011/raw/CCTM_CMAQ_v51_Release_Oct23_NoDust_
25273 Sat Dec 23 20:40:27 2017 ./venv/bin/python ./ingest-cmaq-file.py /projects/datatrans/CMAQ/2011/raw/CCTM_CMAQ_v51_Release_Oct23_NoDust_
25274 Sat Dec 23 20:40:27 2017 ./venv/bin/python ./ingest-cmaq-file.py /projects/datatrans/CMAQ/2011/raw/CCTM_CMAQ_v51_Release_Oct23_NoDust_
25275 Sat Dec 23 20:40:27 2017 ./venv/bin/python ./ingest-cmaq-file.py /projects/datatrans/CMAQ/2011/raw/CCTM_CMAQ_v51_Release_Oct23_NoDust_
25276 Sat Dec 23 20:40:27 2017 ./venv/bin/python ./ingest-cmaq-file.py /projects/datatrans/CMAQ/2011/raw/CCTM_CMAQ_v51_Release_Oct23_NoDust_
25277 Sat Dec 23 20:40:27 2017 ./venv/bin/python ./ingest-cmaq-file.py /projects/datatrans/CMAQ/2011/raw/CCTM_CMAQ_v51_Release_Oct23_NoDust_
25278 Sat Dec 23 20:40:27 2017 ./venv/bin/python ./ingest-cmaq-file.py /projects/datatrans/CMAQ/2011/raw/CCTM_CMAQ_v51_Release_Oct23_NoDust_

As of 2018-01-16, the most complete set is January 2011, with 90 of 459 columns ingested. The remaining months each have between 21 and 24 of 459 columns ingested.

psql (9.6.6)
Type "help" for help.

cmaq=# select utc_date_time::date, max(row), max(col) from exposure_data where utc_date_time::date >= '2011-01-01' group by utc_date_time::date order by utc_date_time::date;
 utc_date_time | max | max
---------------+-----+-----
 2011-01-01    | 299 |  90
 2011-01-02    | 299 |  90
 2011-01-03    | 299 |  90
 2011-01-04    | 299 |  90
 2011-01-05    | 299 |  90
 2011-01-06    | 299 |  90
 2011-01-07    | 299 |  90
 2011-01-08    | 299 |  90
 2011-01-09    | 299 |  90
 2011-01-10    | 299 |  90
 2011-01-11    | 299 |  90
 2011-01-12    | 299 |  90
 2011-01-13    | 299 |  90
 2011-01-14    | 299 |  90
 2011-01-15    | 299 |  90
 2011-01-16    | 299 |  90
 2011-01-17    | 299 |  90
 2011-01-18    | 299 |  90
 2011-01-19    | 299 |  90
 2011-01-20    | 299 |  90
 2011-01-21    | 299 |  90
 2011-01-22    | 299 |  90
 2011-01-23    | 299 |  90
 2011-01-24    | 299 |  90
 2011-01-25    | 299 |  90
 2011-01-26    | 299 |  90
 2011-01-27    | 299 |  90
 2011-01-28    | 299 |  90
 2011-01-29    | 299 |  90
 2011-01-30    | 299 |  90
 2011-01-31    | 299 |  90
 2011-02-01    | 299 |  90
 2011-02-02    | 299 |  24
 2011-02-03    | 299 |  24
 2011-02-04    | 299 |  24
 2011-02-05    | 299 |  24
 2011-02-06    | 299 |  24
 2011-02-07    | 299 |  24
...
 2011-12-22    | 299 |  21
 2011-12-23    | 299 |  21
 2011-12-24    | 299 |  21
 2011-12-25    | 299 |  21
 2011-12-26    | 299 |  21
 2011-12-27    | 299 |  21
 2011-12-28    | 299 |  21
 2011-12-29    | 299 |  21
 2011-12-30    | 299 |  21
 2011-12-31    | 299 |  21
 2012-01-01    | 299 |  21
(366 rows)
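
To put those numbers in perspective, here is a rough back-of-the-envelope projection of how long a full ingest would take at the observed pace. It is only a sketch: it assumes the ingest rate stays linear, counts whole days between the process start date (2017-12-23) and the progress check (2018-01-16), and the function name `projected_total_days` is made up for illustration.

```python
from datetime import date

GRID_COLS = 459  # columns in the 299 x 459 grid


def projected_total_days(cols_done: int, days_elapsed: int,
                         total_cols: int = GRID_COLS) -> float:
    """Linear extrapolation: days needed for all columns at the observed rate."""
    if cols_done <= 0:
        raise ValueError("cols_done must be positive")
    return days_elapsed * total_cols / cols_done


started = date(2017, 12, 23)   # start date taken from the process listing
observed = date(2018, 1, 16)   # date of the progress check
elapsed = (observed - started).days  # whole days only; time of day is ignored

for label, cols in [("January 2011", 90), ("slowest months", 21)]:
    total = projected_total_days(cols, elapsed)
    print(f"{label}: {cols}/{GRID_COLS} columns in {elapsed} days "
          f"-> ~{total:.0f} days in total")
```

Under those assumptions, January 2011 would need roughly 122 days to finish and the slowest months roughly 525 days, which is consistent with the "weeks (or longer)" estimate above.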
lstillwe added a commit that referenced this issue Feb 5, 2018
… chunks. Ingest of 1 month of CMAQ 2010 data has improved from ~36 hours to 24 minutes
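
The commit message above is truncated, so the actual change is not shown here. As a general illustration of what ingesting "in chunks" can look like, the sketch below batches rows and inserts each batch with a single `executemany` call. Everything in it is an assumption: it uses an in-memory `sqlite3` database as a stand-in for the PostgreSQL `cmaq` database, the `value` column and the `chunked` / `ingest` helpers are hypothetical, and the chunk size is arbitrary.

```python
import sqlite3
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def ingest(conn, rows, chunk_size=10_000):
    """Insert rows one chunk per executemany call, committing after each chunk."""
    total = 0
    for chunk in chunked(rows, chunk_size):
        conn.executemany(
            'INSERT INTO exposure_data (utc_date_time, "row", "col", value) '
            "VALUES (?, ?, ?, ?)",
            chunk,
        )
        conn.commit()
        total += len(chunk)
    return total


# Stand-in database and synthetic rows (hypothetical values, one timestamp,
# 299 rows x 10 columns rather than the full 299 x 459 grid).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE exposure_data '
    '(utc_date_time TEXT, "row" INTEGER, "col" INTEGER, value REAL)'
)
rows = (("2011-01-01 00:00:00", r, c, 0.0)
        for r in range(1, 300) for c in range(1, 11))
inserted = ingest(conn, rows, chunk_size=1000)
print(f"inserted {inserted} rows")
```

Batching like this mainly reduces per-statement and per-commit overhead. Whether it accounts for the improvement reported in the commit cannot be determined from the truncated message.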