Improve performance of atomic load in toplevel code #47578

maleadt · 2022-11-15T11:50:34Z

As noted in #47561, the performance of toplevel code is pretty bad because of the atomic barrier (loading the global world age into the task-local one) that's emitted after every instruction now. @vtjnash noted that monotonic ordering is sufficient, which recovers some performance (1.5s -> 0.7s). But the biggest improvement comes from not emitting the atomic operation between every statement, but only when we perform a call.
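To illustrate the idea, here is a minimal C++ sketch, not Julia's actual codegen. It models the global world age as an atomic counter and the task-local copy as a plain field, and it counts how many atomic loads are issued when the refresh happens at every statement versus only before calls. All names (`global_world_age`, `Task`, `sync_world_age`, `bump_world_age`, `run_toplevel`) are hypothetical stand-ins. LLVM's "monotonic" ordering corresponds to `std::memory_order_relaxed` in C++.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for the runtime's global world age counter.
static std::atomic<std::size_t> global_world_age{1};

// Hypothetical stand-in for a task with its own copy of the world age.
struct Task {
    std::size_t world_age = 0;
};

// Refresh the task-local world age using a relaxed load
// (what LLVM calls "monotonic" ordering).
inline void sync_world_age(Task &task) {
    task.world_age = global_world_age.load(std::memory_order_relaxed);
}

// Simulates something (e.g. a new method definition) advancing the world.
inline void bump_world_age() {
    global_world_age.fetch_add(1, std::memory_order_relaxed);
}

// Walks over n_statements toplevel statements, where every call_every-th
// statement is a call. Returns the number of atomic loads performed.
std::size_t run_toplevel(Task &task, std::size_t n_statements,
                         std::size_t call_every, bool barrier_every_statement) {
    std::size_t loads = 0;
    for (std::size_t i = 0; i < n_statements; ++i) {
        const bool is_call = (i % call_every == 0);
        if (barrier_every_statement || is_call) {
            sync_world_age(task);  // load global world age into the task
            ++loads;
        }
        // ... the statement itself would execute here ...
    }
    return loads;
}
```

In this model, a toplevel block of 100 statements with one call per 10 statements issues 100 loads under the per-statement scheme and 10 when the load is emitted only before calls.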

Fixes #47561

vtjnash · 2022-11-15T14:28:17Z

It was somewhat intended. I wanted to move it before every CallInst, instead of doing it after the calls or ccalls, so that it was synchronized with any other atomic barriers written by the user.

maleadt · 2022-11-16T11:44:14Z

> We could also move these costly atomic operations to only appear before CallInst or similar such instructions, where the cost is free

I implemented that suggestion, @vtjnash.

maleadt · 2022-11-17T12:10:23Z

Is the linux32 GC corruption known? I don't see how it could be related...

As this is a simple performance fix for an issue that could otherwise trip up people benchmarking code in the REPL (which seems like a common thing to do), I think it would be good to backport this to 1.8 (if we're still going to do a release before 1.9). Feel free to remove the tag if anybody disagrees.

gbaraldi · 2022-11-17T12:22:24Z

It might be an OOM error showing up as something else. Does retrying fix it?

maleadt · 2022-11-17T14:46:34Z

No, it seems to happen every time...

maleadt · 2022-11-17T22:05:57Z

After the rebase, CI looks better.

vchuravy · 2022-11-19T00:55:54Z

The 32-bit Windows failure is still there, and it also shows up on other PRs rebased on top of this (for example, https://github.com/JuliaLang/julia/commits/vc/vtune).

maleadt added the performance Must go faster label Nov 15, 2022

maleadt force-pushed the tb/global_monotonic_barrier branch from 2277af7 to fe9f220 Compare November 16, 2022 11:43

maleadt mentioned this pull request Nov 16, 2022

Don't inline into toplevel code. #47576

Closed

vtjnash approved these changes Nov 16, 2022

maleadt force-pushed the tb/global_monotonic_barrier branch from fe9f220 to 59dc30f Compare November 17, 2022 09:27

maleadt added backport 1.8 Change should be backported to release-1.8 backport 1.9 Change should be backported to release-1.9 labels Nov 17, 2022

maleadt added 3 commits November 17, 2022 09:47

Demote the atomic barrier inserted between global instns to monotonic.

65a54a3

Move the toplevel barrier emission back to emit_stmtpos.

cd35eab

Load the world age before every call instruction.

edcc9d2

DilumAluthge force-pushed the tb/global_monotonic_barrier branch from 59dc30f to edcc9d2 Compare November 17, 2022 14:47

maleadt merged commit 526cbf7 into master Nov 17, 2022

maleadt deleted the tb/global_monotonic_barrier branch November 17, 2022 22:07

vchuravy added a commit that referenced this pull request Nov 19, 2022

Revert "Improve performance of toplevel code (#47578)"

d7e2d11

This reverts commit 526cbf7.

vchuravy mentioned this pull request Nov 19, 2022

Revert "Improve performance of atomic load in toplevel code" #47635

Merged

maleadt pushed a commit that referenced this pull request Nov 19, 2022

Revert "Improve performance of toplevel code (#47578)" (#47635)

185b583

This reverts commit 526cbf7.

maleadt mentioned this pull request Nov 19, 2022

Reland: Improve performance of global code by emitting fewer atomic barriers. #47636

Merged

maleadt removed backport 1.8 Change should be backported to release-1.8 backport 1.9 Change should be backported to release-1.9 labels Nov 19, 2022
