Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the change-manager API to determine file processing has completed #69

Merged
merged 1 commit into from
Jun 16, 2023

Conversation

mjgiarlo
Copy link
Member

@mjgiarlo mjgiarlo commented Jun 15, 2023

Why was this change made? 🤔

This commit updates our best understanding of how to navigate the Folio APIs to determine that a processFiles request has finished running. Adapted from sul-dlss/libsys-airflow#670

As part of this work, expand data import functionality to be able to handle multiple records.

How was this change tested? 🤨

CI + run against stage with a test script

# If called too quickly, might get Failure(:not_found)
data_importer.status
=> Failure(:pending)
data_importer.wait_until_complete
=> Success()
data_importer.instance_hrid
=> Success("in00000000010")
data_importer.instance_hrids
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiple may be returned now!

# Failure(:not_found) if job is not found
def status
response_hash = client.get("/metadata-provider/jobSummary/#{job_execution_id}")
response_hash = client.get("/change-manager/jobExecutions/#{job_execution_id}")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the API endpoint change.


return Failure(:error) if response_hash["totalErrors"].positive?
return Failure(:pending) if response_hash.dig("sourceRecordSummary", "totalCreatedEntities").zero? && response_hash.dig("sourceRecordSummary", "totalUpdatedEntities").zero?
return Failure(:pending) if !["COMMITTED", "ERROR"].include?(response_hash["status"])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An "ERROR" approach means one or more records failed, but it does not mean they all fail. We probably need a more nuanced way to handle this eventually. Given there are zero consumers of this part of the Ruby gem, though, I was thinking it'd be fine to say "this is good enough for now."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put in a comment to this effect?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, done.

@@ -61,7 +59,7 @@ def default_wait_secs
end

def default_timeout_secs
5 * 60
10 * 60
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed to bump this up to run a test script with a real MARC file containing 1.1K records.

@mjgiarlo mjgiarlo force-pushed the data-import-changes branch from b1e8f6d to 8bbd15e Compare June 16, 2023 20:51
@mjgiarlo mjgiarlo requested a review from justinlittman June 16, 2023 20:52
Copy link
Contributor

@justinlittman justinlittman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve with a comment for your consideration.


return Failure(:error) if response_hash["totalErrors"].positive?
return Failure(:pending) if response_hash.dig("sourceRecordSummary", "totalCreatedEntities").zero? && response_hash.dig("sourceRecordSummary", "totalUpdatedEntities").zero?
return Failure(:pending) if !["COMMITTED", "ERROR"].include?(response_hash["status"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put in a comment to this effect?

This commit updates our best understanding of how to navigate the Folio APIs to determine that a `processFiles` request has finished running. Adapted from sul-dlss/libsys-airflow#670

As part of this work, expand data import functionality to be able to handle multiple records.
@mjgiarlo mjgiarlo force-pushed the data-import-changes branch from 8bbd15e to 334d319 Compare June 16, 2023 21:48
@mjgiarlo mjgiarlo merged commit 63a6711 into main Jun 16, 2023
@mjgiarlo mjgiarlo deleted the data-import-changes branch June 16, 2023 21:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants