Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow large numbers of draw objects #526

Merged
merged 3 commits into from
Mar 20, 2024
Merged

Allow large numbers of draw objects #526

merged 3 commits into from
Mar 20, 2024

Conversation

raphlinus
Copy link
Contributor

Previously there was a limit of workgroup size squared for the number of draw objects, which is 64k in practice. This PR makes each workgroup iterate multiple blocks if that limit is exceeded, borrowing a technique from FidelityFX sort.

WIP, this causes hangs on mac. Uploading to test on other hardware.

Also contains some changes for testing that may not want to be committed as is.

Fixes #334

Previously there was a limit of workgroup size squared for the number of draw objects, which is 64k in practice. This PR makes each workgroup iterate multiple blocks if that limit is exceeded, borrowing a technique from FidelityFX sort.

WIP, this causes hangs on mac. Uploading to test on other hardware.

Also contains some changes for testing that may not want to be committed as is.

Fixes #334
Add barrier for write-after-read hazard in coarse. The loop in question processes 64k draw objects at a time, so the barrier only gets invoked when that limit is exceeded.

Also move new test scene so it isn't the first.
@raphlinus raphlinus marked this pull request as ready for review March 20, 2024 01:24
Copy link
Collaborator

@armansito armansito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

const N_WIDE: usize = 300;
const N_HIGH: usize = 300;
const SCENE_WIDTH: f64 = 2000.0;
const SCENE_HEIGHT: f64 = 1500.0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider assigning the dimensions to params.resolution (where params refers to the second parameter to this function). That will adjust the scene transform so that the contents fit inside the window.

for (var i = 0u; i < firstTrailingBit(WG_SIZE); i += 1u) {
var prefix = sh_scratch[0];

let num_blocks_total = (config.n_drawobj + (WG_SIZE - 1u)) / WG_SIZE;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: You could remove the parentheses around WG_SIZE - 1u.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically the reason for these parentheses was that I didn't trust constant evaluation to understand the associativity of addition. But I think it's better to trust the compiler here, and have the cleanest expression of programmer intent.

let n_blocks_base = num_blocks_total / WG_SIZE;
let remainder = num_blocks_total % WG_SIZE;
let first_block = n_blocks_base * wg_id.x + min(wg_id.x, remainder);
let n_blocks = n_blocks_base + u32(wg_id.x < remainder);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand lines 79-81 correctly, they distribute the remainder across workgroups to load balance? I would add a comment here that clarifies the block/workgroup/thread assignment.

I would also consider moving these calculations to a helper somewhere in shared/ so that they can be reused in shader/draw_reduce.wgsl (though I don't feel strongly about this).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally feel that for 5 lines of logic it's clearer to have them in place than to have to go find them, but realize this is a style issue.

Regarding documentation: I'm seriously thinking we need a wiki entry on prefix sum. I'm inclined to spend a couple hours or so on it. There's just too much lore to jam into a comment. That said, I've written three lines.

let ix = global_id.x;
let tag_word = read_draw_tag_from_scene(ix);
var agg = map_draw_tag(tag_word);
let num_blocks_total = (config.n_drawobj + (WG_SIZE - 1u)) / WG_SIZE;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about explaining the block->thread assignment.

Set resolution in params for test scene. Add comments explaining division of work.
@raphlinus raphlinus added this pull request to the merge queue Mar 20, 2024
Merged via the queue into main with commit 959d313 Mar 20, 2024
15 checks passed
@DJMcNab DJMcNab deleted the big_draw2 branch March 20, 2024 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Flicking problem
2 participants