Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timx 282 dspacedim fmr judgment day #196

Merged
merged 2 commits into from
Jun 27, 2024

Conversation

ehanson8
Copy link
Contributor

Purpose and background context

Finish refactoring DspaceDim class to use field methods.

How can a reviewer manually see the effects of these changes?

Run the following command to see that the DspaceDim transform still transforms a source file:

pipenv run transform -i tests/fixtures/dspace/dspace_dim_records.xml -o output/dspace-transformed-records.json -s whoas

Includes new or updated dependencies?

NO

Changes expectations for external applications?

NO

What are the relevant tickets?

Developer

  • All new ENV is documented in README
  • All new ENV has been added to staging and production environments
  • All related Jira tickets are linked in commit message(s)
  • Stakeholder approval has been confirmed (or is not needed)

Code Reviewer(s)

  • The commit message is clear and follows our guidelines (not just this PR message)
  • There are appropriate tests covering any new functionality
  • The provided documentation is sufficient for understanding any new functionality introduced
  • Any manual tests have been performed and verified
  • New dependencies are appropriate or there were no changes

ehanson8 added 2 commits June 18, 2024 10:27
Why these changes are being introduced:
* Finish refactoring DspaceDim to use field methods

How this addresses that need:
* Add field methods and associated private methods for funding_information, identifiers, languages, links, locations, notes, publishers, related_items, rights, subjects, and summary
* Add unit tests for new field methods

Side effects of this change:
* None

Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/TIMX-282
Copy link
Contributor

@ghukill ghukill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me, nice work. I left a couple of conversational comments, but nothing requested.

Comment on lines +653 to +658
<dim:field mdschema="dc" element="relation"
>A low resolution version of this movie was published.</dim:field>
<dim:field mdschema="dc" element="relation" qualifier="ispartofseries"
>International Association of Aquatic and Marine Science</dim:field>
<dim:field mdschema="dc" element="relation" qualifier="uri"
>https://doi.org/10.1002/2016JB013228</dim:field>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very minor nit, not a required or even suggested change, but I think personally I would find these kind of stubs easier to read if the elements were one line?

e.g.,

source_record = create_dspace_dim_source_record_stub(
    """
    <dim:field mdschema="dc" element="relation">A low resolution version of this movie was published.</dim:field>
    <dim:field mdschema="dc" element="relation" qualifier="ispartofseries">International Association of Aquatic and Marine Science</dim:field>
    <dim:field mdschema="dc" element="relation" qualifier="uri">https://doi.org/10.1002/2016JB013228</dim:field>
    """
)

I understand this breaks line length linting... but the readability might be worth it sometimes.

Again, not requesting a change -- particularly as it's probably fairly common across transforms -- but noting to propose that it's an option when we want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can talk about that, not a big fan of sidescrolling, but not sure if I'm the only one so a later discussion is warranted!

Copy link
Contributor

@jonavellecuerdo jonavellecuerdo Jun 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer side-scrolling 🤔. If the XML tags themselves are long, I would also prefer:

<tag> 
   [some long string]
</tag>

This is a constant struggle in my head while writing fixtures/tests for the field method refactor. 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, yes, let's discuss further!

Comment on lines +239 to +248
return [
timdex.Identifier(
value=str(identifier.string),
kind=identifier.get("qualifier") or "Not specified",
)
for identifier in source_record.find_all(
"dim:field", element="identifier", string=True
)
if identifier.get("qualifier") != "citation"
] or None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using this as an example, but I'm finding these easy to scan now. I do wonder if it's because I've seen quite a few now: list comprehension + filtering clause + or for some values. It's complex, but I think once you take the time to understand one, you kind of understand the rest in the transform.

When we get into orchestration, and all FMR is done, it might be worth looking at some of these and see if some helper utilities can be constructed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I agree that this pattern is ripe for further abstraction. Created an issue for this!

Copy link
Contributor

@jonavellecuerdo jonavellecuerdo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@ehanson8 ehanson8 merged commit 3873233 into main Jun 27, 2024
4 checks passed
@ehanson8 ehanson8 deleted the TIMX-282-dspacedim-fmr-judgment-day branch June 27, 2024 18:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants