assertion failure on Android AVD on Apple M1 #30
Comments
Hi Znerole, thanks for the report! Can you please update me on the current state of Android Studio / SDK / NDK and the emulator on Apple M1? @artwyman, do you know or remember what the magic number in the mentioned code line stands for? |
Sure! M1 is supported only with a canary build of Android Studio for now: https://androidstudio.googleblog.com/2021/04/android-studio-arctic-fox-canary-15.html The accompanying emulator supports running arm64 system images, though it seems like only the Android 11 and 12 preview images are supported. The emulator is not very stable: in the little time I've spent with it so far, I had frequent black screens and similar glitches, but it is possible to run and debug an application from within Android Studio. Except for the djinni assertion, my application (which is mostly C++) seemed to run fine, though I didn't test it heavily. I'm not sure how the emulator works. It is qemu based and it appears to be quite fast. The assertion left me a bit concerned about whether there are reserved values for the handle which might resolve to real memory addresses. |
I have no recollection of where the 4096 came from. It's unlike me to put magic numbers in code without explanation, but blame suggests this originated in a commit I authored 6 years ago: dropbox/djinni@d155107 My best guess is it was meant to catch uses of the wrong jlong value, such as a reference count, instead of the pointer intended here. Pointers are generally not close to zero, but the test of course totally omits the possibility that they could be interpreted as negative. I think it's probably fine to simply remove the assert at this point. Also, an Android emulator for an Apple chip is something I wouldn't have expected to see. :) |
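To make the failure mode concrete, here is a minimal sketch of how a pointer with its top bit set fails a `> 4096` comparison once it is carried in a signed 64-bit value. It is not the actual Djinni code: `passes_magic_number_check` and `as_signed_handle` are hypothetical helpers, `std::int64_t` stands in for `jlong`, and the addresses in the usage note are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the old check: the handle must be greater than 4096.
inline bool passes_magic_number_check(std::int64_t handle) {
    return handle > 4096;
}

// Reinterpret a raw 64-bit address as the signed value a jlong would carry.
// memcpy copies the bit pattern unchanged, so a set top bit becomes the sign bit.
inline std::int64_t as_signed_handle(std::uint64_t address) {
    std::int64_t handle = 0;
    std::memcpy(&handle, &address, sizeof(handle));
    return handle;
}
```

With these helpers, an address such as `0x00007f0012345678` passes the check, while one with the top byte set, such as `0xb400007f12345678`, becomes negative as a signed value and is rejected even though it is not "close to zero".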
This should close cross-language-cpp#30 but keep the existing check in place.
@Znerole can you approve that this solved the issue for you on M1? |
Yes, works for me! Thank you very much! |
FWIW, having this fix crashed the app on a Galaxy A11 device running Android 10. Emulating this device's memory/heap in Android Studio did not cause the crash; it was, however, crashing sporadically (but often) on the physical device. Reverting this change resolved the crash. We ended up removing this |
Thanks for your report @A-Mendi ! I think we also should remove that assert. |
@a4z - Yes, we cherry-picked that change into our fork and that caused the app to crash in a different codepath. I didn't investigate the cause of the crash further, as just removing this line seemed to have resolved the issue |
@A-Mendi, that seems to point to a more profound problem than just this assertion. I also cannot see how there could be a valid pointer in the range of [0..4095], assuming 4k is the memory page size for the given device. On iOS the page size is between 4k and 16k (https://developer.apple.com/library/archive/documentation/Performance/Conceptual/ManagingMemory/Articles/AboutMemory.html), for Android I think it can be assumed that it's 4k or bigger. |
That's a somewhat multifaceted topic. It would of course be interesting to debug A-Mendi's problem, but that is obviously not possible. Since this is an assert, it can be turned off by defining On the other hand, we have the situation that a pointer is cast to a number, passed via the API, and then cast back. I also think that if there were a problem in A-Mendi's code, and the integral were not a valid pointer, an unpleasant exit would be the consequence anyway. So maybe there is another problem (very likely) and this crash is just an indicator. However, we have some kind of code smell here, and the only solution I see is a smarter type erasure that holds the type info internally to the integral and makes use of that. It will take some time until I find enough focus time to implement that, so if anyone wants to do that, please do. All the talks should give enough know-how on the topic. But we might define that task in an extra ticket; we will see, and think a bit more about that. So I leave this ticket open a little longer as a reminder for that. |
The original hand-written bridging code I wrote before Djinni used a type-tagging approach like you describe, though it was in the first word of a struct not embedded in the jlong. It caught a lot of mistakes in my initial experiments. It wasn't included in the first generation of generated code since the risk of mistakes was so much lower. Personally I'd keep the |
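For readers unfamiliar with the idea, a type tag in the first word of a struct could look roughly like the sketch below. This is an illustration only, not the original hand-written bridging code and not anything Djinni generates: `TaggedHandle`, `kHandleTag`, `to_handle` and `from_handle` are hypothetical names, and `std::int64_t` stands in for `jlong`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tag value; any constant unlikely to appear by accident would do.
constexpr std::uint32_t kHandleTag = 0xD11A11D1u;

// Hypothetical handle struct: the tag is the first word, the payload follows.
template <typename T>
struct TaggedHandle {
    std::uint32_t tag;
    T* object;
};

// Pack the struct's address into the 64-bit integer that crosses the language boundary.
template <typename T>
std::int64_t to_handle(TaggedHandle<T>* h) {
    return static_cast<std::int64_t>(reinterpret_cast<std::intptr_t>(h));
}

// Unpack the integer and verify the tag before trusting the payload.
// Returns nullptr for a zero handle or a mismatched tag.
template <typename T>
T* from_handle(std::int64_t handle) {
    auto* h = reinterpret_cast<TaggedHandle<T>*>(static_cast<std::intptr_t>(handle));
    if (h == nullptr || h->tag != kHandleTag) {
        return nullptr;
    }
    return h->object;
}
```

Note that the tag check dereferences the handle, so it catches a handle that points at the wrong kind of object, but it cannot protect against an arbitrary integer (such as a reference count) being passed in; that would still be an invalid read. Using a different tag per wrapped type would be a natural extension.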
Thanks Andrew! |
As discussed in cross-language-cpp#30, the magic number has been causing problems for some people for some time. Please find the details, and why we switch to this assert now, in the discussion of cross-language-cpp#30. Fixes cross-language-cpp#30, second version.
As it was raised in cross-language-cpp/djinni-generator#110, this issue might be the M1-based manifestation of missing support for ARM64 tagged pointers, which was introduced with Android 11: https://source.android.com/devices/tech/debug/tagged-pointers |
What do you mean by missing support for ARM64 tagged pointers, @davidschreiber? |
@a4z I think what @davidschreiber is trying to say is that pointer tagging could be the reason why pointer values on M1 exceeded the magic number limit in the first place. If this is indeed causing crashes on Android, I wonder if we are risking crashes on M1 as well? This is just me guessing, but maybe Apple is not (yet) enforcing top-byte ignore which is why the underlying problem was not yet discovered? |
The magic number does not exist anymore, so we should be safe. We also have not had any crash on Android since we fixed this issue, afaik. Or am I still missing something? |
Hmm... @davidschreiber are the crashes that you reported in cross-language-cpp/djinni-generator#110 caused by the assert discussed in this issue or because Android detected a top byte modification and therefore terminated the process? |
Yes, the crashes are exactly the same, as we saw the ARM64 failures on the same assert. This failure is most likely due to the top bit being marked and interpreted as a negative number, so performing a null check instead in this place seems like a good decision going forward. However, I don't know about the extent of pointer tagging, and whether there are other issues that might come from it. To reproduce this on Android, you need an Arm64 device running API 30, and set the I'm going to see how fast we can make the transition from our current Djinni fork to this project, and we'll see if any other issues arise after doing so (we're developing https://pspdfkit.com for Android, and make extensive use of Djinni). |
If you need any help with the transition, let us know, we are happy to help. |
When using Djinni in the Android Emulator for Apple M1 (arm64), the assertion in djinni_support.hpp:323 fails.
This is a false positive (no pun intended). In this environment handle can be negative. I'm not quite sure where the 4096 comes from, but I would suggest either removing the assertion, casting the (signed) jlong value to unsigned for this assertion, or allowing the value to be either negative or greater than 4096. I can confirm that my application runs perfectly fine when commenting out this assertion.
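The alternatives suggested here can be sketched as small predicates. This is a rough illustration under assumptions, not the code in djinni_support.hpp: the function names are hypothetical, `std::int64_t` stands in for `jlong`, and the third variant is the plain null check that was mentioned in the discussion as a replacement.

```cpp
#include <cassert>
#include <cstdint>

// Variant 1: compare as unsigned, so a set top bit no longer reads as negative.
inline bool check_unsigned(std::int64_t handle) {
    return static_cast<std::uint64_t>(handle) > 4096u;
}

// Variant 2: accept values that are either negative or greater than 4096.
inline bool check_negative_or_large(std::int64_t handle) {
    return handle < 0 || handle > 4096;
}

// Variant 3: drop the magic number and only reject a null handle.
inline bool check_not_null(std::int64_t handle) {
    return handle != 0;
}
```

The first two variants accept and reject exactly the same values; they only differ in how the intent is spelled out. The null check is the weakest of the three, since it also accepts small positive values such as a stray reference count.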