Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove uses of Arc::make_mut to avoid clones #2934

Closed
wants to merge 1 commit into from

Conversation

shssoichiro
Copy link
Collaborator

@shssoichiro shssoichiro commented May 9, 2022

Profiling revealed there to be a measurable number of clones coming out of TileStateMut::new, which was traceable to usage of Arc::make_mut. A characteristic of Arc::make_mut is that it will clone the inner data when a thread attempts to write to it. In theory, I have concerns that this could cause data inconsistencies, although that does not seem to have happened in practice in rav1e. However, the clones are avoidable, because each place where we use Arc::make_mut can be safely assumed to be independent--i.e. there is never more than one thread accessing the same portion of the data.

640x360 with 1 tile: 1.5% memory usage reduction
640x360 with 4 tiles: 4% memory usage reduction
1920x1080 with 4 tiles: 6% memory usage reduction

No measurable runtime improvement with 1 tile.
~1% runtime improvement with 4 tiles.

Cargo.toml Outdated Show resolved Hide resolved
@shssoichiro shssoichiro force-pushed the remove-make--mut branch 3 times, most recently from 13dc931 to 73c1d1a Compare May 13, 2022 18:17
@coveralls
Copy link
Collaborator

coveralls commented May 13, 2022

Coverage Status

Coverage increased (+0.05%) to 85.494% when pulling 24ec704 on shssoichiro:remove-make--mut into 295ad2e on xiph:master.

@shssoichiro shssoichiro force-pushed the remove-make--mut branch 2 times, most recently from 5b3bf7d to 7b2cdb0 Compare May 17, 2022 05:46
// SAFETY: We're not inside a tile, there's nothing else writing to the encoder config here.
unsafe {
// Increment the film grain seed for the next frame
if let Some(params) = arc_get_mut_unsafe(&mut fi.config)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't be better to move the film grain information to the FrameInvariants? Also who is accessing the config here?

fi,
);
// SAFETY: We know no other threads are accessing this because we are done with tiles.
// The only references would be in ref frames, which are not being touched at the moment.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the gain from doing this? Maybe it should be moved to a different patch.

Copy link
Collaborator

@lu-zero lu-zero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fs.frame_me_stats hits tile_iter_mut with a count of 2, this is in itself fishy or possibly a bug that went unnoticed since a while.

Ideally get_mut() should always work.

shssoichiro added a commit to shssoichiro/rav1e that referenced this pull request Jun 2, 2022
This partially addresses the comments in xiph#2934 by removing the
unnecessary field `coded_data.lookahead_me_stats`. This data is never
modified between the time it is created in
`compute_lookahead_motion_vectors` and the time it is used in
`compute_block_importances`, so there is no need to keep a separate copy
of it. Instead, we can use the main copy of the data in
`fs.frame_me_stats`.

Results in ~1% peak memory usage reduction.
shssoichiro added a commit to shssoichiro/rav1e that referenced this pull request Jun 2, 2022
This partially addresses the comments in xiph#2934 by removing the
unnecessary field `coded_data.lookahead_me_stats`. This data is never
modified between the time it is created in
`compute_lookahead_motion_vectors` and the time it is used in
`compute_block_importances`, so there is no need to keep a separate copy
of it. Instead, we can use the main copy of the data in
`fs.frame_me_stats`.

Results in ~1% peak memory usage reduction.
shssoichiro added a commit that referenced this pull request Jun 2, 2022
This partially addresses the comments in #2934 by removing the
unnecessary field `coded_data.lookahead_me_stats`. This data is never
modified between the time it is created in
`compute_lookahead_motion_vectors` and the time it is used in
`compute_block_importances`, so there is no need to keep a separate copy
of it. Instead, we can use the main copy of the data in
`fs.frame_me_stats`.

Results in ~1% peak memory usage reduction.
@lu-zero
Copy link
Collaborator

lu-zero commented Jun 3, 2022

https://gist.github.com/lu-zero/ffc7152ab0aabc651f0fde3da42b4559 this fixes most of the problems
Still fail:

    test_encode_decode::bitrate_dav1d
    api::test::guess_frame_subtypes_assert
    api::test::log_q_exp_overflow
    api::test::max_quantizer_bounds_correctly
    api::test::min_quantizer_bounds_correctly

@shssoichiro
Copy link
Collaborator Author

All the tests pass for me when I apply these changes against master. However, I still see the significant increase in memory usage, because this is basically the same as not having an Arc there--it's cloning the inner contents.

@lu-zero
Copy link
Collaborator

lu-zero commented Jun 4, 2022

You need 1 clone for the lookahead and 1 clone for the normal encoding in theory.

@shssoichiro shssoichiro closed this Jun 6, 2022
@shssoichiro
Copy link
Collaborator Author

Closed for #2959

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants