Graphene doesn't run under Valgrind #1919
Some explanations. Graphene has two parts: LibOS and PAL. (Below I'm talking about the Linux PAL; the Linux-SGX PAL is quite different.) PAL starts first and serves as a bootloader -- in the case of the Linux (non-SGX) PAL, this is the pal-Linux executable. Then PAL starts LibOS. LibOS has its own memory allocator, so LibOS must consult PAL: "give me a huge contiguous memory region that I can use for my memory allocations". This region should be vast -- 100s of GBs -- because LibOS must serve huge applications like giant databases. The function that returns this region is _DkGetAvailableUserAddressRange. And here we have a bad design assumption: LibOS assumes that the Linux-PAL executable is loaded at a high address. In other words, the design relies on the Linux loader (or, in this case, the Valgrind loader) mapping the main executable (which in this case is pal-Linux) at a high address.
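To make that assumption concrete, a minimal sketch of the idea (an illustration only, not the actual Graphene code; TEXT_START is the linker-provided start of PAL's text section mentioned in the issue text below):

#include <stdint.h>

extern char TEXT_START;  /* linker-provided: start of PAL's text section */

/* Simplified version of the idea behind _DkGetAvailableUserAddressRange:
 * LibOS is offered everything from a low bound up to wherever the loader
 * placed the PAL executable. If the loader put PAL near 0x7fff_ffff_ffff,
 * the range is huge; if Valgrind put it at 0x108000, the range is tiny. */
static void get_user_address_range(uintptr_t* start, uintptr_t* end) {
    *start = 0x10000;                 /* skip the lowest pages */
    *end   = (uintptr_t)&TEXT_START;  /* stop below PAL's own image */
}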
@pwmarcz Your proposed patch will break Graphene. What your patch forces the Linux PAL to do: "Hey, LibOS, use 0x10000 .. 0x7fffffffffff as you wish, it is free memory." Whereas in reality, the Linux PAL executable itself occupies the space 0x108000 .. 0x121000 (the upper limit is approximate, based on my quick look at the size: ll -h Pal/src/host/Linux/libpal.so).[1] So LibOS may overwrite the range 0x108000 .. 0x121000 (because it thinks it is free), and then the PAL layer contains garbage. As soon as the application on top of LibOS wants to do some syscall or smth and goes down to PAL code, we will see segfaults.

[1] Actually, the upper limit is even higher, because this doesn't count the current 64MB of static arrays (used for initial memory management).
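The conflict is just a range overlap; a toy check with the numbers from this comment (standalone C, not Graphene code):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    /* What the proposed patch would report to LibOS as free: */
    uintptr_t free_start = 0x10000,  free_end = 0x7fffffffffff;
    /* What the Linux PAL executable actually occupies (approximate): */
    uintptr_t pal_start  = 0x108000, pal_end  = 0x121000;

    /* Two ranges overlap iff each one starts before the other ends. */
    if (free_start < pal_end && pal_start < free_end)
        printf("overlap: LibOS may clobber %#lx..%#lx of PAL\n",
               (unsigned long)pal_start, (unsigned long)pal_end);
    return 0;
}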
Okay... so if I understand this, a correct implementation could go and look for the biggest contiguous chunk of unmapped memory?
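As an illustration of that idea, a rough sketch that scans /proc/self/maps for the largest gap between existing mappings (not Graphene code; among other things, a real implementation would have to avoid the non-canonical address hole):

#include <stdio.h>
#include <stdint.h>

/* Find the largest gap between the mappings listed in /proc/self/maps. */
static int largest_unmapped_gap(uintptr_t* gap_start, uintptr_t* gap_end) {
    FILE* f = fopen("/proc/self/maps", "r");
    if (!f)
        return -1;

    unsigned long prev_end = 0x10000;  /* don't hand out the lowest pages */
    unsigned long best_start = 0, best_end = 0;
    unsigned long start, end;

    while (fscanf(f, "%lx-%lx%*[^\n]", &start, &end) == 2) {
        if (start > prev_end && start - prev_end > best_end - best_start) {
            best_start = prev_end;
            best_end   = start;
        }
        if (end > prev_end)
            prev_end = end;
    }
    fclose(f);

    *gap_start = best_start;
    *gap_end   = best_end;
    return 0;
}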
I see two solutions.
Summoning @mkow @boryspoplawski @yamahata.
And how do you feel about parsing ...?
We can also parse ...
Small correction: it should be ...
IMO the ideal solution (it seems doable, and that was the plan IIRC) would be to cede all memory management to LibOS. PAL would then mark some regions as used by it (early in init; something like the complement of @dimakuv's option 1) and could request more memory if needed via callbacks to LibOS.
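A sketch of what such an interface could look like (every name here is invented for illustration; this is not an existing Graphene API):

#include <stddef.h>
#include <stdint.h>

struct mem_region {
    uintptr_t start;
    uintptr_t end;
};

/* LibOS-provided callbacks that PAL uses once LibOS owns all memory. */
struct pal_mem_callbacks {
    void* (*alloc_for_pal)(size_t size);            /* get memory for PAL's internal use */
    void  (*free_for_pal)(void* addr, size_t size); /* return it */
};

/* Called by PAL early in LibOS init: "these regions are already mine,
 * don't reuse them"; PAL receives callbacks for any later allocations. */
void libos_take_over_memory(const struct mem_region* pal_regions, size_t count,
                            struct pal_mem_callbacks* out_callbacks);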
The ideal solution would be something like option 1 with a twist: PAL also reports the regions used by PAL itself. Then, early in LibOS initialization, LibOS should somehow take over physical memory management.
Okay, looks like it's not that simple... Even if the user address range is changed to some static area (I used ...), it still fails. As far as I can tell, this is because the Python executable is not PIE, and that would suggest there's no good way of using Valgrind+Graphene with such executables except patching Valgrind. Am I right?
Heh, that's sad. This means that without fixing this "Valgrind puts the main executable at address 0x108000" issue, we cannot support any non-PIE executable bigger than ...
But then how can Valgrind relocate a non-PIE python executable to ...?
I think the difference is that ...
The other way around: the main executable is ...
I guess 0x108000 is used as a default base only for PIE binaries; non-PIEs have to be loaded at their selected address. [edit] Seems I raced with Paweł on the reply :)
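For reference, whether a binary is PIE is visible in its ELF header: ET_DYN means PIE (relocatable), ET_EXEC means fixed load addresses. A small standalone checker (readelf -h <binary> shows the same Type field):

#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Report whether a 64-bit ELF binary is PIE or has fixed load addresses. */
int main(int argc, char** argv) {
    if (argc != 2) {
        fprintf(stderr, "usage: %s <binary>\n", argv[0]);
        return 1;
    }
    FILE* f = fopen(argv[1], "rb");
    if (!f)
        return 1;
    Elf64_Ehdr ehdr;
    if (fread(&ehdr, sizeof(ehdr), 1, f) != 1 ||
        memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0) {
        fclose(f);
        return 1;
    }
    fclose(f);
    printf("%s\n", ehdr.e_type == ET_DYN  ? "PIE (relocatable)" :
                   ehdr.e_type == ET_EXEC ? "non-PIE (fixed addresses)" :
                                            "other ELF type");
    return 0;
}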
I added a comment to the bug above, since I guess we would be happy with a command-line override in Valgrind. Also, would it make sense to create a non-PIE executable for Graphene, to ensure it's loaded high, just for this purpose? Is that even possible?
Can you create a non-PIE executable with a hard-coded address that is not ...?
After overriding the hardcoded address in Valgrind to a high enough pointer, some simple workloads work (I tried Python). But when I tried testing something more complicated (specifically TensorFlow), I ran into another bug: "Running signal handler with alternate stack allocated on current stack crashes callgrind". The approach with a non-PIE Graphene binary sounds promising, I'll try it out sometime.
Signal handling and alternate stacks are not correctly implemented in Graphene. This is a well-known problem that was almost fixed by this PR: #1218. Parts of that PR were merged as other PRs, but the alternate-stack thingy is still biting us. But it warms my heart that Graphene is not the only project that keeps postponing the fix for this issue :)
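The problematic pattern from that Valgrind bug is easy to state in plain C; a minimal sketch (nothing Graphene-specific, just an alternate signal stack carved out of the current stack):

#include <signal.h>
#include <stdio.h>
#include <string.h>

static void handler(int sig) {
    (void)sig;  /* runs on the alternate stack, which itself lives on main's stack */
}

int main(void) {
    char altstack[SIGSTKSZ];  /* alternate stack allocated ON the current stack */

    stack_t ss = { .ss_sp = altstack, .ss_size = sizeof(altstack), .ss_flags = 0 };
    if (sigaltstack(&ss, NULL) < 0)
        return 1;

    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sa.sa_flags = SA_ONSTACK;  /* deliver the signal on the alternate stack */
    sigaction(SIGUSR1, &sa, NULL);

    raise(SIGUSR1);  /* reportedly crashes under Callgrind */
    puts("survived");
    return 0;
}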
Unfortunately, yes.
diff --git a/coregrind/m_ume/elf.c b/coregrind/m_ume/elf.c
index 21eb52bcb..07c961be6 100644
--- a/coregrind/m_ume/elf.c
+++ b/coregrind/m_ume/elf.c
@@ -586,7 +586,7 @@ Int VG_(load_ELF)(Int fd, const HChar* name, /*MOD*/ExeInfo* info)
ebase = 0x100000;
# else
vg_assert(VKI_PAGE_SIZE >= 4096); /* stay sane */
- ESZ(Addr) hacky_load_address = 0x100000 + 8 * VKI_PAGE_SIZE;
+ ESZ(Addr) hacky_load_address = 0x60000000000;
if (ebase < hacky_load_address)
ebase = hacky_load_address;
# endif
@pwmarcz: Is this still relevant? I remember that some recent changes might have fixed Valgrind. Also, will we actually want Valgrind support once we have UBSan+ASan?
I think the issue with the load address is still there? Anyway, yes, I don't think this is a promising avenue. Initially I wanted to use Callgrind for profiling, but perf works pretty well. Our UBSan and ASan integration will probably not cover everything Valgrind has to offer (we have no tracebacks, no stack instrumentation, etc.), but it seems less invasive and should be easier to expand than getting Valgrind to work.
Thanks for the details! I'll close this issue then.
Steps to reproduce
I was trying to profile the code using Callgrind.
- Build Graphene with DEBUG=1.
- In LibOS/shim/test/regression, run make.
- Run the test under valgrind --tool=callgrind (note: running the pal-Linux binary directly, not through pal_loader).

Tested on current master (fbb2fca) and Ubuntu 18 (Valgrind 3.13.0).
Expected results
The program runs and exits with 0 status.
Actual results
The code fails initialization:
Investigation
Digging a bit deeper, it seems that _DkGetAvailableUserAddressRange returns a really small range of addresses (0x10000 to 0x108000), not enough to load ld-linux.so into that range.

This is because the code uses the text section address (TEXT_START) as the end of the range. When running Graphene directly, this is a large pointer (somewhere below 0x7fff_ffff_ffff). However, Valgrind always loads the text section at a fixed address of 0x108000:
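The code in question, from Valgrind's coregrind/m_ume/elf.c (the same lines that the patch quoted earlier in this thread modifies):

vg_assert(VKI_PAGE_SIZE >= 4096); /* stay sane */
ESZ(Addr) hacky_load_address = 0x100000 + 8 * VKI_PAGE_SIZE;
if (ebase < hacky_load_address)
   ebase = hacky_load_address;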
:There is a patch attached, but as far as I can tell, it doesn't necessarily help (it does not care if the address is high enough, only tries to find an area that is free).
Possible fix
Can the user address range (for Linux host) be determined in some other way? I don't know enough about either Graphene or Linux internals to understand if there is any harm in mapping memory above the text section.
Can we detect Valgrind's behaviour and do something different? For what it's worth, the following patch makes the above example (bootstrap) work with Valgrind:
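On the "detect Valgrind" question: Valgrind ships a client-request header for exactly this kind of check. A minimal sketch (this is not the patch referred to above, just an illustration of the detection idea; it assumes the Valgrind development headers are installed):

#include <stdio.h>
#include <valgrind/valgrind.h>

int main(void) {
    /* RUNNING_ON_VALGRIND expands to 0 natively and non-zero under any Valgrind tool. */
    if (RUNNING_ON_VALGRIND)
        puts("running under Valgrind: could pick a different user address range");
    else
        puts("running natively");
    return 0;
}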