Support COFF/bigobj for large images on Windows. #52561
Comments
We discovered that Windows did not behave well, so we disabled it, especially as very little development is actually done directly on that platform |
That is a pity... I want to use ModelingToolkit in a class, and I have to suggest that the students create a system image because loading ModelingToolkit is still slow otherwise, even with Julia 1.10. It makes a difference if that takes 15 min or 30 min, though... |
I don't think we hard disable parallelism per-se, but ironically only when the system images are very large: Line 1454 in b57f8d1
x-ref: #50874 At the core of the issue is that COFF (the default binary format on Windows) uses 16 bits for the number of symbols. |
How do you calculate what is "very large"? |
I think it is Line 1420 in 590a63b
|
Could we use the https://en.wikipedia.org/wiki/Portable_Executable format on Windows? |
PE is pretty much COFF; COFF was the Unix name and PE is the Microsoft name |
Maybe a version with 64-bit address space support called PE32+? |
It's going to be the same. The issue is that they use a 16-bit integer to identify stuff, and that has a 65k maximum value. People have complained to Microsoft lots of times about this, but they haven't fixed it |
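To make the 65k figure concrete, here is a small Python sketch of the arithmetic. The helper name `fits_in_u16` and the sample counts are made up for illustration; they are not taken from Julia, LLVM, or the COFF/PE specification.

```python
# Largest value an unsigned 16-bit field can hold.
U16_MAX = 2**16 - 1  # 65535

def fits_in_u16(count: int) -> bool:
    """Return True if `count` can be stored in an unsigned 16-bit field."""
    return 0 <= count <= U16_MAX

# Hypothetical counts, chosen only to show both sides of the limit.
for count in (1_000, 65_535, 65_536, 200_000):
    print(count, fits_in_u16(count))
```

Anything above 65535 simply cannot be represented in such a field, which is why a large image can hit the limit while a small one does not.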
Or can we reduce the number of exported symbols? And why does the number of exported symbols depend on the number of threads used in the first place? |
When we use multiple threads we split the output into multiple files itself, but you still want calls across the split files to be fast. In single-threaded compilation we never make the symbols visible. |
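As a rough illustration of why splitting the output makes symbols visible, the toy Python model below assigns functions to shards and collects the callees that are called from a different shard. It is only a sketch of the idea described above: the function `symbols_needing_visibility`, the call list, and the shard assignments are all hypothetical and do not reflect how Julia actually partitions an image.

```python
def symbols_needing_visibility(calls, shard_of):
    """Return the callees that are called from a different shard.

    calls:    iterable of (caller, callee) pairs
    shard_of: dict mapping function name -> shard index
    """
    return {callee for caller, callee in calls
            if shard_of[caller] != shard_of[callee]}

# Hypothetical call graph: f calls g and h, g calls h, h calls f.
calls = [("f", "g"), ("g", "h"), ("h", "f"), ("f", "h")]

one_shard = {"f": 0, "g": 0, "h": 0}   # single-threaded: one output file
two_shards = {"f": 0, "g": 0, "h": 1}  # two threads: two output files

print(sorted(symbols_needing_visibility(calls, one_shard)))
print(sorted(symbols_needing_visibility(calls, two_shards)))
```

With a single shard no call crosses a file boundary, so nothing has to be visible outside the file; with two shards, every function reached from the other shard does.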
This sounds to me as if we could use another file format for the split files... |
To my knowledge that is not feasible, but I would be happy to be proven wrong. |
Why not? Which program produces the intermediate files? And which program combines them? |
Perhaps you should ask Microsoft for a developer to help out? They must be interested that Windows users have a good experience when using Julia... |
Adding my 2c here...
LLVM, specifically the part of LLVM that compiles Windows native binaries. The intermediate files are .o files, similar to what you'd get from compiling a .c/.cpp file on Windows.
Usually for system images this will be your system's
Over time many different projects have run into this (LLVM, rust) and so far I have not seen evidence that Microsoft is interested in/capable of fixing it (evidence from 2019). |
Thanks Prem for the links. To summarize my understanding Microsoft added |
LLVM tools handle COFF+ bigobj perfectly; you don't even need to pass them this option (e.g. clang automatically determines whether it needs to generate a bigobj file). The PE limit on the number of exported symbols is a completely different story: it isn't related to bigobj and, I believe, will never be fixed by MS. Thus, to overcome this limit you would have to invent your own exporting (and loading) scheme. E.g., I developed such a scheme in my GHC on native Windows SDK project. I don't know which problem you face here, but if it is the former (bigobj) it should be easily fixable. |
No, I think you are right. I was looking at this yesterday: bigobj is related to sections, and we are facing the number of symbols here. |
One option we could try is the symbol hiding llvm does for itself now when building itself with clang+mingw |
This is the original error:
We should already be doing the symbol hiding, but I am also unsure if this occurs with package images or only in the context of package compiler. |
I guess this is a mistake. You absolutely don't need to They only need to be visible globals (if I understand your intentions correctly). IIUC, probably this is wrong. |
They're not dllexport, only extern hidden (global aliases are a rarer case that we do need to dllexport for language reasons). We mark the symbols with hidden visibility on line 838, and we also confirmed that they're hidden in the final shared object on Linux. |
If they aren't dllexport the linker shouldn't try to export them then, unless it is invoked with |
MWE:
Check out from git:
Build the system image:
Now look at the Task Manager and check the CPU load. My CPU load was mostly at 7%, sometimes a bit higher,
but 8 threads were not used when creating the system image.
Doing the same on Linux is much faster, and the CPU load reaches 800% about 2 minutes into the
"building system image" step.
I used juliaup to run Julia. The machine had 24GB of RAM; never more than 50% was in use.
Summary: Building a system image is slow on Windows and fast on Linux because on Windows Julia 1.10-rc2 uses only one thread.
Is this a known limitation?
How it should work is documented:
JULIA_IMAGE_THREADS
An unsigned 32-bit integer that sets the number of threads used by image compilation in this Julia process. The value of this variable may be ignored if the module is a small module. If left unspecified, the smaller of the value of JULIA_CPU_THREADS or half the number of logical CPU cores is used in its place.
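The documented default can be read as the following Python sketch. It is one interpretation of the text quoted above, not Julia's implementation: the function name `default_image_threads` is hypothetical, and rounding half the cores down, never going below 1, and falling back to half the cores when `JULIA_CPU_THREADS` is unset are all assumptions. The "small module" exception is not modeled.

```python
import os

def default_image_threads(env=None, logical_cores=None):
    """Thread count per the documented rule (illustrative reading only)."""
    env = os.environ if env is None else env
    cores = logical_cores if logical_cores is not None else (os.cpu_count() or 1)

    # An explicit JULIA_IMAGE_THREADS takes precedence.
    explicit = env.get("JULIA_IMAGE_THREADS")
    if explicit is not None:
        return int(explicit)

    # Assumption: "half the logical cores" rounds down, with a floor of 1.
    half = max(1, cores // 2)

    # Otherwise use the smaller of JULIA_CPU_THREADS and half the cores.
    cpu_threads = env.get("JULIA_CPU_THREADS")
    if cpu_threads is not None:
        return min(int(cpu_threads), half)
    return half

print(default_image_threads({}, logical_cores=16))
print(default_image_threads({"JULIA_CPU_THREADS": "4"}, logical_cores=16))
print(default_image_threads({"JULIA_IMAGE_THREADS": "8"}, logical_cores=16))
```

Under this reading, a machine with 16 logical cores and no variables set would be expected to use several threads, which is the behaviour the Windows build above does not show.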