Timx 287 ead field method refactor 3 #190

Merged
merged 1 commit into from
Jun 5, 2024

Conversation

jonavellecuerdo
Contributor

@jonavellecuerdo jonavellecuerdo commented Jun 3, 2024

Purpose and background context

Field method refactor for transform class Ead (Part 3).

  • Add field methods and corresponding unit tests for: languages, locations, notes,
    physical description, publishers, and summary

How can a reviewer manually see the effects of these changes?

  1. Run make test and verify all unit tests are passing.

  2. Run CLI command

    pipenv run transform -i tests/fixtures/ead/ead_record_all_fields.xml -o output/ead-transformed-records.json -s aspace
    

    Output:

    2024-06-03 12:17:26,897 INFO transmogrifier.cli.main(): Logger 'root' configured with level=INFO
    2024-06-03 12:17:26,897 INFO transmogrifier.cli.main(): No Sentry DSN found, exceptions will not be sent to Sentry
    2024-06-03 12:17:26,897 INFO transmogrifier.cli.main(): Running transform for source aspace
    2024-06-03 12:17:26,914 WARNING transmogrifier.sources.transformer.get_valid_title(): Record repositories/2/resources/1 has multiple titles. Using the first title from the following titles found: ['Charles J. Connick Stained Glass Foundation Collection', 'Title 2', 'Title 3']
    2024-06-03 12:17:26,923 INFO transmogrifier.cli.main(): Completed transform, total records processed: 1, transformed records: 1, skipped records: 0, deleted records: 0
    2024-06-03 12:17:26,923 INFO transmogrifier.cli.main(): Total time to complete transform: 0:00:00.026022
    

Includes new or updated dependencies?

NO

Changes expectations for external applications?

NO

What are the relevant tickets?

Developer

  • All new ENV is documented in README
  • All new ENV has been added to staging and production environments
  • All related Jira tickets are linked in commit message(s)
  • Stakeholder approval has been confirmed (or is not needed)

Code Reviewer(s)

  • The commit message is clear and follows our guidelines (not just this PR message)
  • There are appropriate tests covering any new functionality
  • The provided documentation is sufficient for understanding any new functionality introduced
  • Any manual tests have been performed and verified
  • New dependencies are appropriate or there were no changes

@jonavellecuerdo jonavellecuerdo self-assigned this Jun 3, 2024
@jonavellecuerdo jonavellecuerdo force-pushed the TIMX-287-ead-field-method-refactor-3 branch from a89ed45 to ec3b026 Compare June 3, 2024 15:01
@@ -31,7 +31,6 @@ def get_optional_fields(self, source_record: Tag) -> dict | None:

    # <archdesc> and <did> elements are required when deriving optional fields
    collection_description = self._get_collection_description(source_record)
    collection_description_did = self._get_collection_description_did(source_record)
Contributor Author

Methods have been created for all fields that retrieved data from <archdesc><did>, thus this line of code is not needed!

Comment on lines +461 to +472
languages = []
collection_description_did = cls._get_collection_description_did(source_record)
for langmaterial_element in collection_description_did.find_all(
    "langmaterial", recursive=False
):
    languages.extend(
        [
            language
            for language_element in langmaterial_element.find_all("language")
            if (language := cls.create_string_from_mixed_value(language_element))
        ]
    )
Contributor Author

@jonavellecuerdo jonavellecuerdo Jun 3, 2024

For field methods that involve nested for loops, the following pattern is applied:

  • instantiate an empty list (languages = [])
  • write the outer for loop statement
  • inside it, extend the list with a list comprehension over the inner for loop
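
The pattern in the bullets above can be sketched outside the project code. This is only a minimal illustration: it assumes the standard library's xml.etree.ElementTree in place of BeautifulSoup, a hypothetical clean_text helper standing in for create_string_from_mixed_value, and made-up sample XML. Returning None for an empty list is also an assumption, mirroring the "| None" return types visible elsewhere in this PR.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Made-up sample; not taken from the project's fixtures.
SAMPLE_DID = """
<did>
  <langmaterial>
    <language>English</language>
    <language>  </language>
  </langmaterial>
  <langmaterial>
    <language>French</language>
  </langmaterial>
</did>
"""


def clean_text(element: ET.Element) -> Optional[str]:
    """Hypothetical stand-in: collapse the element's text, None if empty."""
    text = " ".join("".join(element.itertext()).split())
    return text or None


def get_languages(did: ET.Element) -> Optional[list]:
    languages = []  # 1. instantiate the list
    for langmaterial in did.findall("langmaterial"):  # 2. outer loop, direct children only
        languages.extend(  # 3. extend with a comprehension over the inner loop
            [
                language
                for language_element in langmaterial.iter("language")
                if (language := clean_text(language_element))
            ]
        )
    return languages or None


print(get_languages(ET.fromstring(SAMPLE_DID)))
```

The walrus assignment lets the comprehension both filter out empty values and reuse the computed string, so the helper runs once per element.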

Contributor

I appreciate the comment explaining this, as I may have passed over this syntax and approach more quickly otherwise. This is interesting: I'm not sure I've seen a walrus operator as part of a list comprehension.

I cooked this up to help me unpack it:

nums = list(range(0,10))

def even_num_strings(num):
    if num % 2 == 0:
        return str(num)

[
    even_num
    for num in nums
    if (even_num := even_num_strings(num))
]
# Out[7]: ['0', '2', '4', '6', '8']

I must say, I kind of like it! I think this post has a nice example where it can avoid duplicate API or otherwise expensive calls. I think it's a bit more complex than some lists and for loops, but I'm on board.
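
To make the point about duplicate calls concrete, here is a small sketch that counts how often a function runs with and without the walrus operator. The lookup function and its call counter are hypothetical, standing in for an API request or another expensive call.

```python
call_count = 0


def lookup(num: int):
    """Hypothetical expensive call; counts how often it runs."""
    global call_count
    call_count += 1
    return str(num) if num % 2 == 0 else None


nums = list(range(10))

# Without the walrus operator: lookup() runs once in the filter and again
# in the output expression for every item that passes the filter.
call_count = 0
twice = [lookup(num) for num in nums if lookup(num)]
calls_without_walrus = call_count

# With the walrus operator: lookup() runs exactly once per item.
call_count = 0
once = [value for num in nums if (value := lookup(num))]
calls_with_walrus = call_count

print(twice == once, calls_without_walrus, calls_with_walrus)
# True 15 10
```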

Comment on lines +499 to +506
for note_element in collection_description.find_all(
    [
        "bibliography",
        "bioghist",
        "scopecontent",
    ],
    recursive=False,
):
    subelement_tag = "bibref" if note_element.name == "bibliography" else "p"
Contributor Author

Just noting that XPath + lxml may be useful in derivations like this one!

Contributor

As discussed on a quick call, I agree. I also agree that, given the scope of this field method refactoring, this is probably not the time to make, or even fully explore, the switch.

But helpful to have these concrete examples if and when we do revisit that.
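
For reference if the switch is revisited, the sketch below shows how path expressions could express a similar selection. It is only an illustration under stated assumptions: it uses the limited XPath subset in the standard library's xml.etree.ElementTree so that it runs without extra dependencies (lxml's xpath() method supports full XPath 1.0), the sample XML is made up, and NOTE_PATHS and get_note_texts are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Made-up sample; not taken from the project's fixtures.
SAMPLE_ARCHDESC = """
<archdesc>
  <bibliography><bibref>Book A</bibref><bibref>Book B</bibref></bibliography>
  <bioghist><p>Biography paragraph.</p></bioghist>
  <scopecontent><p>Scope paragraph.</p></scopecontent>
  <dsc><scopecontent><p>Nested, not a direct child.</p></scopecontent></dsc>
</archdesc>
"""

# One path per note type. Each path starts at direct children of <archdesc>,
# which plays the role of recursive=False in the loop under review.
NOTE_PATHS = {
    "bibliography": "./bibliography/bibref",
    "bioghist": "./bioghist/p",
    "scopecontent": "./scopecontent/p",
}


def get_note_texts(archdesc: ET.Element) -> dict:
    """Hypothetical helper: collect note text per note type via path lookups."""
    return {
        note_type: ["".join(el.itertext()).strip() for el in archdesc.findall(path)]
        for note_type, path in NOTE_PATHS.items()
    }


print(get_note_texts(ET.fromstring(SAMPLE_ARCHDESC)))
```

Unlike the loop, this groups results by note type rather than preserving document order, which is one of the trade-offs a real switch would need to weigh.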

@jonavellecuerdo jonavellecuerdo marked this pull request as ready for review June 3, 2024 16:18
Contributor

@ghukill ghukill left a comment

Looks good! Left a couple of conversational comments, but nothing blocking.

Contributor

@ehanson8 ehanson8 left a comment

Great work, just a few suggestions!

@classmethod
def get_locations(cls, source_record: Tag) -> list[timdex.Location] | None:
    locations = []
    control_access_elements = cls._get_control_access(source_record)
Contributor

Move cls._get_control_access(source_record) to for loop?

Contributor Author

Moved. See cc25014. I moved calls to private methods when they weren't followed by a .find/.findall.
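
The kind of change described here amounts to inlining a helper call into the for statement. A minimal sketch, assuming a hypothetical _get_items helper and Example class in place of the project's private methods:

```python
class Example:
    @classmethod
    def _get_items(cls, record: dict) -> list:
        """Hypothetical helper standing in for a private lookup method."""
        return record.get("items", [])

    @classmethod
    def before(cls, record: dict) -> list:
        results = []
        items = cls._get_items(record)  # intermediate variable used only once
        for item in items:
            results.append(item.upper())
        return results

    @classmethod
    def after(cls, record: dict) -> list:
        results = []
        for item in cls._get_items(record):  # helper called in the for statement
            results.append(item.upper())
        return results
```

Both versions behave the same; the second simply drops a name that was only read once.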

Contributor

@ehanson8 ehanson8 left a comment

Fantastic work on this transform!

Why these changes are being introduced:
* These updates are required to implement the architecture described
in the following ADR: https://github.com/MITLibraries/transmogrifier/blob/main/docs/adrs/0005-field-methods.md

How this addresses that need:
* Add field methods and corresponding unit tests:
  languages, locations, notes, physical description,
  publishers, and summary

Side effects of this change:
* None

Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/TIMX-287
@jonavellecuerdo jonavellecuerdo force-pushed the TIMX-287-ead-field-method-refactor-3 branch from cc25014 to c30a779 Compare June 5, 2024 18:27
@jonavellecuerdo jonavellecuerdo merged commit 79d1b35 into main Jun 5, 2024
5 checks passed
@jonavellecuerdo jonavellecuerdo deleted the TIMX-287-ead-field-method-refactor-3 branch June 5, 2024 18:30