Use modified EDGAR diurnal scale factors file to correctly read data #1255

Closed

@lizziel lizziel commented May 17, 2022

This PR updates the GC-Classic and GCHP config files to use new file EDGARv42/v2015-02/NO/EDGAR_hourly_NOxScale.modified.nc, which is a duplicate of the file currently used but with all missing data removed and replaced with value 1.0.

The currently used file, EDGARv42/v2015-02/NO/EDGAR_hourly_NOxScale.nc, uses missing values and a _FillValue attribute value of 1 to indicate where scale factors are 1.0. This causes a silent bug in GC-Classic where near-zero values are used instead of 1.0, since HEMCO treats all missing values as 0 rather than using the _FillValue (see here, here, here, and here).

This also causes a run failure in GCHP if a new _FillValue update is enabled in the new version of MAPL we just updated to. This is because data with values equivalent to _FillValue are now treated as missing and not included in regridding. For this reason that fix is currently disabled in MAPL in GCHP. We will enable it following this update (geoschem/MAPL#21).

This update causes small differences in GC-Classic. No differences are expected for GCHP.

The EDGARv42/v2015-02/NO/EDGAR_hourly_NOxScale.nc file uses missing values
and a _FillValue attribute value of 1 to indicate where scale factors are
1.0. This causes a silent bug in GC-Classic where zero values are used
instead since HEMCO treats all missing values as 0 rather than using the
_FillValue.

This also causes a run failure in GCHP after the recent MAPL update if
the included _FillValue fix is enabled. This is because data with values
equivalent to _FillValue are now treated as missing and not included in
regridding. For this reason that fix is currently disabled in MAPL. We
will enable it following this update.

This commit updates the GC-Classic and GCHP config files to use
EDGARv42/v2015-02/NO/EDGAR_hourly_NOxScale.modified.nc, which is a
duplicate of the non-modified file but with all missing data removed and
replaced with value 1.0.

This update causes small differences in GC-Classic. No differences are
expected for GCHP.

Signed-off-by: Lizzie Lundgren <elundgren@seas.harvard.edu>
@lizziel lizziel added the category: Bug Something isn't working label May 17, 2022
@lizziel lizziel requested a review from msulprizio May 17, 2022 23:03
@lizziel lizziel added this to the 14.0.0 milestone May 17, 2022
Contributor

@msulprizio msulprizio left a comment

This update is simply replacing the file used for EDGAR_TODNOX. I have no objections, but would like to request that you attach an emissions table and difference plots here quantifying the impact of this fix on GCClassic output.

@lizziel lizziel changed the title Use modified EDGAR diurnal scale factors file to correctly read data [WIP] Use modified EDGAR diurnal scale factors file to correctly read data May 18, 2022
@lizziel lizziel marked this pull request as draft May 18, 2022 16:30
@lizziel lizziel removed this from the 14.0.0 milestone May 18, 2022
Contributor Author

lizziel commented May 18, 2022

I changed this PR to draft status since I would like to investigate the HEMCO missing value handling further. I specifically want to better understand the differences introduced by using a scale factor of 1 rather than a missing value. I found code in HEMCO that assigns a value of 1 to scale factor values equal to the HEMCO definition of a missing value. This would suggest the original file should work okay. The differences introduced by using the modified file may be due to the handling of missing values in regridding. If this is the case, further investigation is needed into which handling is better for scale factors: using missing values or data values of 1.

I also found a kludge in HEMCO in part of the missing value handling that applies when using GCHP (see here). This should be revisited given the new missing value handling in MAPL. It is also possible that additional special handling of missing values for GCHP is needed in HEMCO.

@lizziel lizziel changed the title [WIP] Use modified EDGAR diurnal scale factors file to correctly read data Use modified EDGAR diurnal scale factors file to correctly read data May 26, 2022

stale bot commented Jun 27, 2022

This issue has been automatically marked as stale because it has not had recent activity. If there are no updates within 7 days it will be closed. You can add the "never stale" tag to prevent the Stale bot from closing this issue.

@stale stale bot added the stale No recent activity on this issue label Jun 27, 2022
@lizziel lizziel removed the stale No recent activity on this issue label Jun 28, 2022

stale bot commented Jul 30, 2022

This issue has been automatically marked as stale because it has not had recent activity. If there are no updates within 7 days it will be closed. You can add the "never stale" tag to prevent the Stale bot from closing this issue.

@stale stale bot added the stale No recent activity on this issue label Jul 30, 2022
@msulprizio msulprizio added never stale Never label this issue as stale and removed stale No recent activity on this issue labels Aug 1, 2022
@msulprizio msulprizio added this to the 14.1.0 milestone Aug 1, 2022
@lizziel lizziel self-assigned this Aug 11, 2022
Contributor Author

lizziel commented Oct 17, 2022

I am closing this draft PR. After further investigation and discussion with @christophkeller, I decided that we should not merge it in. Using data values of 1.0 instead of missing values when reading and regridding the EDGAR scale factors will dilute the scale factors that are adjacent to cells marked as missing. This is particularly true along the coast.
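
To make the dilution effect concrete, here is a minimal sketch of an area-weighted average onto one coarse coastal cell. It is not HEMCO's regridding code: the `coarse_mean` helper, the `MISSING` sentinel, and the sample values are all hypothetical and chosen only for illustration.

```python
# Illustrative only: a toy area-weighted regrid of four fine cells onto one
# coarse cell. Names and values are assumptions, not taken from HEMCO.

MISSING = -999.0  # hypothetical sentinel for "no data" in this sketch


def coarse_mean(values, weights, missing=None):
    """Area-weighted mean of fine cells, skipping cells equal to `missing`."""
    pairs = [
        (v, w)
        for v, w in zip(values, weights)
        if missing is None or v != missing
    ]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        # Every contributing cell was missing, so the result is missing too.
        return missing
    return sum(v * w for v, w in pairs) / total_weight


# Two land cells with a scale factor of 1.4 and two ocean cells with no data.
fine_values = [1.4, 1.4, MISSING, MISSING]
fine_weights = [1.0, 1.0, 1.0, 1.0]

# Case 1: missing cells are skipped, so only the land cells contribute.
skipped = coarse_mean(fine_values, fine_weights, missing=MISSING)

# Case 2: missing cells are first replaced by 1.0 and then averaged in.
filled_values = [1.0 if v == MISSING else v for v in fine_values]
filled = coarse_mean(filled_values, fine_weights)

print(f"skip missing: {skipped:.2f}, fill with 1.0: {filled:.2f}")
```

In this toy case, skipping the missing cells keeps the coarse value at about 1.4, while filling them with 1.0 pulls it down to about 1.2, which is the kind of coastal dilution described above.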

The current GC-Classic handling is correct: it sets all input values that are equal to the file's _FillValue attribute to HCO_MISSVAL, and then skips those grid cells during regrid interpolation. Later, during the GEOS-Chem run, all values equal to HCO_MISSVAL are set to either 0 (for masks and emission levels) or 1 (for scale factors) every timestep prior to emissions retrieval.
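
The two stages described above can be sketched roughly as follows. This is a simplified illustration, not HEMCO's implementation: the numeric value given to `HCO_MISSVAL` here is a placeholder, and the function names and the `kind` argument are hypothetical.

```python
# Illustrative only: the sentinel value and helper names are assumptions.

HCO_MISSVAL = -1.0e31  # placeholder sentinel; not necessarily HEMCO's value


def mark_missing(values, fill_value):
    """Stage 1 (on read): flag every value equal to the file's _FillValue."""
    return [HCO_MISSVAL if v == fill_value else v for v in values]


def resolve_missing(values, kind):
    """Stage 2 (each timestep): replace flagged cells with a neutral default.

    Scale factors default to 1 (no scaling); masks and emissions default to 0.
    """
    if kind == "scale":
        default = 1.0
    elif kind in ("mask", "emission"):
        default = 0.0
    else:
        raise ValueError(f"unknown field kind: {kind!r}")
    return [default if v == HCO_MISSVAL else v for v in values]


# A file whose _FillValue happens to be 1.0, as in the EDGAR case.
raw = [1.0, 0.8, 1.3]
marked = mark_missing(raw, fill_value=1.0)

# Between the two stages, regridding would skip the flagged cells.
as_scale = resolve_missing(marked, "scale")
as_mask = resolve_missing(marked, "mask")
```

The sketch shows why the fill value itself does not matter for the outcome: the flagged cell ends up as 1 for a scale factor and 0 for a mask, whatever number the file used as its _FillValue.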

We therefore should keep using the current EDGAR diurnal scale factors file which uses missing values rather than data value 1.0. It happens that the _FillValue is 1.0 for this file, but that value is not actually used beyond identifying the cells with missing data.

GCHP will require a separate fix to accompany the MAPL update that identifies input values equal to _FillValue as missing. This will be an update in HEMCO only.

@lizziel lizziel closed this Oct 17, 2022
@msulprizio msulprizio removed this from the 14.1.0 milestone Oct 17, 2022
@msulprizio msulprizio deleted the bugfix/Edgar_diurnal_scale_factors_with_FillValue_fix branch January 11, 2023 16:39
Labels
category: Bug Something isn't working never stale Never label this issue as stale
2 participants