Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/json custom datetime format #638

Merged
merged 4 commits into from
Sep 13, 2023

Conversation

jdddog
Copy link
Contributor

@jdddog jdddog commented Sep 8, 2023

No description provided.

@jdddog
Copy link
Contributor Author

jdddog commented Sep 8, 2023

@keegansmith21 it looks like these changes may have broken some of the tests on the academic observatory, so it would be good to check it for that and oaebu workflows before merging it.

======================================================================
ERROR: test_telescope (academic_observatory_workflows.workflows.tests.test_doi_workflow.TestDoiWorkflow)
Test the DOI telescope end to end.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/academic-observatory-workflows/academic-observatory-workflows/academic_observatory_workflows/workflows/tests/test_doi_workflow.py", line 379, in test_telescope
    bq_load_observatory_dataset(
  File "/home/runner/work/academic-observatory-workflows/academic-observatory-workflows/academic_observatory_workflows/model.py", line 1243, in bq_load_observatory_dataset
    bq_load_tables(
  File "/home/runner/work/academic-observatory-workflows/observatory-platform/observatory-platform/observatory/platform/observatory_environment.py", line 1219, in bq_load_tables
    exit(os.EX_CONFIG)
  File "/opt/hostedtoolcache/Python/3.8.17/x64/lib/python3.8/_sitebuiltins.py", line 26, in __call__
    raise SystemExit(code)
SystemExit: 78

======================================================================
FAIL: test_telescope (academic_observatory_workflows.workflows.tests.test_crossref_events_telescope.TestCrossrefEventsTelescope)
Test the Crossref Events telescope end to end.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/academic-observatory-workflows/academic-observatory-workflows/academic_observatory_workflows/workflows/tests/test_crossref_events_telescope.py", line 249, in test_telescope
    self.assert_table_content(release.bq_main_table_id, expected_content, workflow.primary_key)
  File "/home/runner/work/academic-observatory-workflows/observatory-platform/observatory-platform/observatory/platform/observatory_environment.py", line 889, in assert_table_content
    assert results, "Rows in actual content do not match expected content"
AssertionError: Rows in actual content do not match expected content

======================================================================
FAIL: test_telescope (academic_observatory_workflows.workflows.tests.test_openalex_telescope.TestOpenAlexTelescope)
Test the OpenAlex telescope end to end.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/academic-observatory-workflows/academic-observatory-workflows/academic_observatory_workflows/workflows/tests/test_openalex_telescope.py", line 797, in test_telescope
    self.assert_table_content(table_id, expected_data, "id")
  File "/home/runner/work/academic-observatory-workflows/observatory-platform/observatory-platform/observatory/platform/observatory_environment.py", line 889, in assert_table_content
    assert results, "Rows in actual content do not match expected content"
AssertionError: Rows in actual content do not match expected content

======================================================================
FAIL: test_telescope (academic_observatory_workflows.workflows.tests.test_unpaywall_telescope.TestUnpaywallTelescope)
Test workflow end to end.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/academic-observatory-workflows/academic-observatory-workflows/academic_observatory_workflows/workflows/tests/test_unpaywall_telescope.py", line 382, in test_telescope
    self.assert_table_content(release.bq_main_table_id, expected_content, "doi")
  File "/home/runner/work/academic-observatory-workflows/observatory-platform/observatory-platform/observatory/platform/observatory_environment.py", line 889, in assert_table_content
    assert results, "Rows in actual content do not match expected content"
AssertionError: Rows in actual content do not match expected content

----------------------------------------------------------------------
Ran 205 tests in 1382.782s

@keegansmith21 keegansmith21 force-pushed the feature/json_custom_datetime_format branch from 46af980 to a5172b0 Compare September 12, 2023 06:42
@keegansmith21
Copy link
Contributor

keegansmith21 commented Sep 12, 2023

@jdddog

I've had a look at the errors and their sources. Here is my examination of the issue:

All of the failures that I can see are due to type differences of datetime.date (expected) vs datetime.datetime (actual). The reason this happens is because any date field that is declared in the json parsing function is (now) converted to a datetime.date object (when it used to be a datetime.datetime object). This eliminates the time and timezone components of the object. I believe that this is the correct way of doing this as we require a distinction to be made between these two types. BQ drops the time/timestamp components if you upload a datetime (timestamp-like) object into a date field.

In the past, we have been using this same function to upload timestamps as BQ Timestamp fields without making the distinction between dates and timestamps, so all dates have actually been datetimes (timestamp-like). The three telescopes in AO that are failing have actually required that the timestamp remain intact as they use BQ Timestamp fields.

So, I can change the parsing function to work only with datetimes, but that would somewhat defeat the purpose of the distinction made by this PR. I think it would be better to alter the telescopes/tests to make the distinction between the date/timestamp fields. I am happy to do this.

Relevant PRs:

@codecov
Copy link

codecov bot commented Sep 12, 2023

Codecov Report

Patch coverage: 89.18% and project coverage change: +0.41% 🎉

Comparison is base (8aa0338) 86.91% compared to head (296a073) 87.32%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #638      +/-   ##
===========================================
+ Coverage    86.91%   87.32%   +0.41%     
===========================================
  Files           34       34              
  Lines         4654     4680      +26     
  Branches       563      572       +9     
===========================================
+ Hits          4045     4087      +42     
+ Misses         495      477      -18     
- Partials       114      116       +2     
Files Changed Coverage Δ
...rm/observatory/platform/observatory_environment.py 85.48% <89.18%> (+3.62%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jdddog jdddog marked this pull request as ready for review September 13, 2023 03:31
@jdddog
Copy link
Contributor Author

jdddog commented Sep 13, 2023

Thanks for the changes Keegan!

@jdddog jdddog enabled auto-merge September 13, 2023 03:33
@keegansmith21 keegansmith21 self-requested a review September 13, 2023 03:33
@jdddog jdddog added this pull request to the merge queue Sep 13, 2023
Merged via the queue into develop with commit b94c968 Sep 13, 2023
@jdddog jdddog deleted the feature/json_custom_datetime_format branch October 20, 2023 03:49
jdddog added a commit that referenced this pull request Feb 25, 2024
Co-authored-by: keegansmith21 <keegan.r.s21@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants