Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dummy inline arrays in asdf.util and weldxfile #469

Merged
merged 25 commits into from
Aug 20, 2021

Conversation

marscher
Copy link
Contributor

@marscher marscher commented Aug 6, 2021

Changes

  • The dummy block manager impl was replaced with a much simpler patching of the block creating function inside asdf. The replaced function does not allocate a block, but replaces the array with an empty list. With the additional asdf write option all_array_storage="internal" we achieve zero additional output memory for array serialization. This is being checked with a test.
  • The display_header function takes a (deep) copy of the Asdf tree, but without copying numpy arrays. This approach avoids side effects due manipulating the tree internally.
  • Added pytest options to use WeldxFile inside the read/write_buffer functions used in the testsuite. This increases the testing coverage of the class. Added a GH actions job to check this new configuration on Linux, py3.8 soley.

Checks

  • updated CHANGELOG.md
  • updated tests

@codecov
Copy link

codecov bot commented Aug 6, 2021

Codecov Report

Merging #469 (042a436) into master (afb3340) will increase coverage by 0.01%.
The diff coverage is 91.48%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #469      +/-   ##
==========================================
+ Coverage   96.84%   96.86%   +0.01%     
==========================================
  Files          89       89              
  Lines        5550     5547       -3     
==========================================
- Hits         5375     5373       -2     
+ Misses        175      174       -1     
Impacted Files Coverage Δ
weldx/util.py 99.18% <ø> (+0.81%) ⬆️
weldx/asdf/util.py 84.88% <87.50%> (+0.05%) ⬆️
weldx/asdf/file.py 97.43% <100.00%> (+0.54%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update afb3340...042a436. Read the comment docs.

@marscher
Copy link
Contributor Author

marscher commented Aug 6, 2021

osx is still fucked up (most likely because it is a fucked up os). Linux and Windows do no need more than 10% memory now during serialization. I've patched the copy behavior of ndarrays, because asdf always deepcopies the tree.

@CagtayFabry
Copy link
Member

nice, really liking this solution
can we skip this specific test only for osx?

@vhirtham
Copy link
Collaborator

vhirtham commented Aug 9, 2021

osx is still fucked up (most likely because it is a fucked up os). Linux and Windows do no need more than 10% memory now during serialization. I've patched the copy behavior of ndarrays, because asdf always deepcopies the tree.

        before = get_mem_info()
        fn = tempfile.mktemp(suffix=".wx", dir=tmpdir)
        with WeldxFile(mode=mode) as fh:
            fh["x"] = large_array
            fh.show_asdf_header(use_widgets=False, _interactive=False)
            fh.write_to(fn)

        after = get_mem_info()

Didn't really try to understand the whole test, but can't we move the second line out of the memcheck block? If we only want to check the memory footprint of show_asdf_header, we should remove as many possibly allocating commands as possible. Don't know what tempfile does under the hood, but maybe the fail on osx is not caused by show_asdf_header but one of the other involved lines. Just roughly followed this one, so sorry if I missed something.

can we skip this specific test only for osx?

I would want to understand why it is failing before disabling it. Disabling a failing test just lets me ask: why do we have the test in the first place if it isn't that important.

@marscher
Copy link
Contributor Author

marscher commented Aug 9, 2021

@vhirtham tempfile checks for an unused filename in $TEMPDIR and returns it. But the writeto method could involve larger memory footprints (e.g. buffers are kept in the background for whatever reason). I followed your advice and measure only the memory during show_asdf_header. And you are right about the test philosophy :) Still, osx sucks!

@vhirtham
Copy link
Collaborator

vhirtham commented Aug 9, 2021

@vhirtham tempfile checks for an unused filename in $TEMPDIR and returns it. But the writeto method could involve larger memory footprints (e.g. buffers are kept in the background for whatever reason). I followed your advice and measure only the memory during show_asdf_header. And you are right about the test philosophy :) Still, osx sucks!

Glad to hear that it worked. 😉

@CagtayFabry CagtayFabry added file weldx file handling (WeldxFile) ASDF everything ASDF related (python + schemas) labels Aug 12, 2021
@CagtayFabry
Copy link
Member

In case there are still issues with osx I prefer to mark the test as xfail on that platform for now, see https://docs.pytest.org/en/latest/how-to/skipping.html#condition-parameter

@CagtayFabry
Copy link
Member

I have changed the non interactive header output to just show the yaml string again for testing @marscher e29052f

@marscher
Copy link
Contributor Author

I have changed the non interactive header output to just show the yaml string again for testing @marscher e29052f

I do not understand why pytest stdout capturing does not work for the print statement in asdf.info(). I also thought that the cluttering of the test output was quite annoying. But we should not change the implementation/desired output just because the test tool has a hick up, right?

@CagtayFabry
Copy link
Member

I have changed the non interactive header output to just show the yaml string again for testing @marscher e29052f

I do not understand why pytest stdout capturing does not work for the print statement in asdf.info(). I also thought that the cluttering of the test output was quite annoying. But we should not change the implementation/desired output just because the test tool has a hick up, right?

there were some (probably unrelated issues)
however looking into it I actually like the pure yaml output also in a console because it shows the tag definitions etc and all nested elements. In addition asdf.info seems to do some kind of tag resolving to classes which is not really needed and imo can complicate things
However since the change is very simple we can revert back anytime?

@CagtayFabry
Copy link
Member

On another note can you explain the idea behind the _USE_WELDX_FILE flag?

@marscher
Copy link
Contributor Author

On another note can you explain the idea behind the _USE_WELDX_FILE flag?

The debug utility functions read_buffer and write_buffer will use WeldxFile internally, if the flag is set. Since these util functions are used extensively in the test suite, it increases the testing coverage of WeldxFile. There are two pytest flags to toggle this flag. I've also added a single pytest job to the matrix (just on Linux) to run this configuration.

# Conflicts:
#	weldx/asdf/file.py
#	weldx/asdf/util.py
#	weldx/tests/asdf_tests/test_weldx_file.py
@pep8speaks
Copy link

pep8speaks commented Aug 19, 2021

Hello @marscher! Thanks for updating this PR.

Line 1:5: E999 SyntaxError: invalid syntax

Comment last updated at 2021-08-20 12:04:57 UTC

@marscher marscher marked this pull request as ready for review August 19, 2021 19:39
@marscher marscher requested a review from CagtayFabry August 19, 2021 20:00
@CagtayFabry CagtayFabry self-requested a review August 20, 2021 06:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ASDF everything ASDF related (python + schemas) file weldx file handling (WeldxFile)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants