Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Startup issue on recent macOS on M3 #29

Open
cdegroot opened this issue Dec 20, 2024 · 13 comments
Open

Startup issue on recent macOS on M3 #29

cdegroot opened this issue Dec 20, 2024 · 13 comments

Comments

@cdegroot
Copy link

cdegroot commented Dec 20, 2024

(Note, I don't have a mac myself, I still need to whip up a mini on AWS to reproduce, this is what colleagues have witnessed).

On macOS on a recent Macbook (thus on Apple's ARM chips), it seems like some of the ODPI structs have different sizes. The initialization check finds a discrepancy between the expected size for the OraData (48 bytes) and the actual size (64 bytes).

Given that OraDataBuffer does seem to pass the check, the only reasonable hypothesis is some extremely aggressive alignment on mac.

In any case, I mostly am putting this out there in case anyone else sees this and to figure out whether we're the first to use Oracle.jl on mac/ARM? It sounds unlikely...

(more to follow, but probably not until next year)

(.venv) ➜  src git:(eng-2485) ✗ julia oracle_example.jl
ERROR: LoadError: InitError: AssertionError: OraData should have sizeof 48 bytes. Found 64 bytes.
Stacktrace:
  [1] __init__()
    @ Oracle ~/.julia/packages/Oracle/4DDzj/src/Oracle.jl:45
@felipenoris
Copy link
Owner

I'm also running into this error.

@cjbj
Copy link

cjbj commented Jan 6, 2025

What compiler / options are being used to compile ODPI-C?

@felipenoris
Copy link
Owner

What compiler / options are being used to compile ODPI-C?

Please refer to

function build_shared_library(; verbose::Bool=false)
.

I also noticed this same error on Linux using Julia nightly.

@cjbj
Copy link

cjbj commented Jan 6, 2025

What gcc / clang versions?

Before getting too deep into this, can you update ODPI-C from 4.2.1 to the current release 5.4.1?

@sudarshan12s
Copy link

sudarshan12s commented Jan 7, 2025

Is using @REPR(C) in src/types.jl for only arm architecture feasable? something like (not verified though.. )


# Determine if the platform is ARM
const IS_ARM = Sys.MACHINE in ("aarch64", "arm")

# Define OraData conditionally
if IS_ARM
    @repr(C) struct OraData
        is_null::Int32        # 4 bytes
        value::OraDataBuffer  # 40 bytes
    end
else
    struct OraData
        is_null::Int32        # 4 bytes
        value::OraDataBuffer  # 40 bytes
    end
end

# Verify the size of OraData
println("Size of OraData: ", sizeof(OraData))
@assert sizeof(OraData) == 48

or Does it work with manual padding?:


struct OraData
    is_null::Int32          # 4 bytes
    padding::Int32          # 4 bytes, ensures 8-byte alignment
    value::OraDataBuffer    # 40 bytes
end

@cdegroot
Copy link
Author

cdegroot commented Jan 7, 2025

@cjbj 5.4.1 has the same issue, but if I prep a patch (have mac access now) I'll bump while we're at it. C compiler is default on macOS

Apple clang version 16.0.0 (clang-1600.0.26.6)
Target: arm64-apple-darwin24.2.0
Thread model: posix

@sudarshan12s I'll probably go with just changing the size expectation on aarch64 for now and then see what other structs have similar alignment optimizations. Where does @repr() come from by the way? I'm not a Julia specialist, can't find any references to it.

cdegroot added a commit to calmwave-open-source/Oracle.jl that referenced this issue Jan 7, 2025
I don't think this is safe: it will cause Julia to allocate too much
data when allocating for C (which is fine) but also to copy too much
data when copying from C (which is not fine, as that can cause an
overrun). However, the chance that stack or memory end on a not-64-byte
boundary on the 64 bit systems we're targeting is pretty much
nonexistent so it should work until a proper fix is found.

Related to felipenoris#29
cdegroot added a commit to calmwave-open-source/Oracle.jl that referenced this issue Jan 8, 2025
This is unsafe code, don't merge, a good chance that things will
crash because of this. Just trying out ideas here.

Related to felipenoris#29
@cdegroot
Copy link
Author

cdegroot commented Jan 8, 2025

Did some toying around and concluded that the size difference is real and it's dangerous to override the check. My reasoning:

  • If Julia allocates an OraData, it'll allocate too much. That's fine.
  • If Julia passes an OraData by value to C, it will write too much data (Julia thinks 64 bytes, C thinks 48 bytes), which is bad.
  • If Julia fetches an OraData by value from C, it will copy too much data. On 64 bit platforms, that poses a small risk (as segments are likely to be on 64 byte boundaries anyway) but it's not what you want.

So the only real solution is to define the Julia struct in a way that it matches the C struct, alas. No cheap fixes :). In Julia's internals, there's a MAX_ALIGN that seems to be in play here which is much larger on aarch64 than on x64. It's used in a lot of places so hard to figure out how this exactly works but that definition matches the observed difference.

@sudarshan12s
Copy link

@cjbj 5.4.1 has the same issue, but if I prep a patch (have mac access now) I'll bump while we're at it. C compiler is default on macOS

Apple clang version 16.0.0 (clang-1600.0.26.6)
Target: arm64-apple-darwin24.2.0
Thread model: posix

@sudarshan12s I'll probably go with just changing the size expectation on aarch64 for now and then see what other structs have similar alignment optimizations. Where does @repr() come from by the way? I'm not a Julia specialist, can't find any references to it.

My Mistake, I overlooked this. repr(C) seems to be for Rust . I am also not familiar with Julia, I was too trying to see if Julia has way to disable padding and enforce fixed layout or follow C layout with FFI as you mentioned.. Does this happen only with dpiData(having union) or with other structures too?

@cdegroot
Copy link
Author

cdegroot commented Jan 8, 2025

Seems to be only this one. And I haven't found an easy way around it yet :)

(enough I could hack, probably, but this smells like something's off in Julialand. I guess I'll have to venture out there and find a forum or something)

@cdegroot
Copy link
Author

cdegroot commented Jan 9, 2025

Asked on Julia Forum, https://discourse.julialang.org/t/unexpected-struct-size-on-aarch64-macos-m/124608

And answered there. The tl&dr is that primitive types and structs get different alignment, so OraDataBuffer needs to become a struct reflecting the size of the C-side union.

@cdegroot
Copy link
Author

cdegroot commented Jan 15, 2025

@felipenoris I'm still without macOS access but it looks like we've gotten to the bottom of this on the forum? I think I can brew up a PR that changes the primitive type in a struct and maybe (not sure how) one that makes the macOS build uses aarch64 binaries/compilation, but I can't really locallly test it. Or do you have other ideas?

@giordano
Copy link

I'm still without macOS access

https://github.com/mxschmitt/action-tmate

@cdegroot
Copy link
Author

https://github.com/mxschmitt/action-tmate

Heh, nice work-around. I asked work for a Mac Mini instead :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants