-
Notifications
You must be signed in to change notification settings - Fork 271
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
amd64: fixes memory.fill bug #1055
Conversation
Signed-off-by: Takeshi Yoneda <takeshi@tetrate.io>
I haven't identified the root cause, so assigning @evacchi ! |
i32.const 1000 | ||
i32.const 1000 | ||
i32.const 1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like if these values exceed one byte range (>0xff) this happens, that's all I narrowed down!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting, the reason why the bug does not occur on aarch64 is less trivial than you may think.
Maybe the spec is a bit unclear, because memory.fill
takes 3 args:
dest
: destination addressvalue
: byte value to repeatwidth
: offset fromdst
if value > 0xff
then things start to get funky for us. But value
is the byte that should fill the memory from address dest
to offset width
. Let me stress here it is supposed to be a byte. If value
is anything but a byte it is unclear to me whether we should bitcast or trap (and I couldn't find references).
Proof
I have dumped the contents of the memory buffer at 999:2001; on M1, for value:=255
we get:
"\x00\xff\xff\xff\xff\xff\xff\ ... \xff\xff\x00"
"\x00\xff\xff\xff\xff\xff\xff\ ... \xff\xff\x00"
we get the same on amd64.
Now, let's set value:=256
on amd64
:
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00..."
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x01\x01\x01\x01\x01\x01\x00\x01\x01\x01\x01\x01\x01\x01\x00\x01\x01\x01\x01\x01\x01\x01\x00..."
which is obviously incorrect. But what happens instead on M1?
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00..."
"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00..."
which is equal... because:
(1) we are casting to byte in the interpreter
value := byte(ce.popValue())
(2) on aarch64 STRB
takes (I think) the least significant byte from the 32-bit register
(3) on amd64, the bug is only triggered for width>15
because compileMemoryFill
"uses efficient REP STOSQ
instructions to copy in quadword (8 bytes) batches if the size if above 15 bytes"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, the fix is to ensure that value
is 1 byte before IMULQ
value to 0x0101010101010101
, which can be done by ANDQ 0xff value.register
Signed-off-by: Edoardo Vacchi <evacchi@users.noreply.github.com>
04ec992
to
ae94ccb
Compare
riscv tests? 🤔 |
}) | ||
|
||
// Checks if the memory state is the same between engines. | ||
require.Equal(t, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this out-of-bounds is due to how run() works. it is called up to two times, depending on if platform.CompilerSupported()
so my guess is this test is only really going to do anything if platform.CompilerSupported()
as that's the only time you'll have two modules. maybe guard on it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added at the top:
if !platform.CompilerSupported() {
return
}
Signed-off-by: Edoardo Vacchi <evacchi@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I'll watch for comments from takeshi for my own learning purposes, but fixes the issue and doesn't break anything new!
excellent! |
This was found by the new fuzzing target in #1054