Write CEP about virtual packages #103
Conversation
cep-????.md
Outdated
- Python's `os.confstr("CS_GNU_LIBC_VERSION")`
- `getconf GNU_LIBC_VERSION`
- `ldd --version` (in GLIBC distros)
- System's `libc.so` (in MUSL distros, location not standardized)
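The probes listed above can be combined into a best-effort sketch; `detect_glibc_version` is a hypothetical helper for illustration, not conda API:

```python
import os
import platform


def detect_glibc_version():
    """Best-effort GNU libc version probe (hypothetical helper).

    Returns a version string like "2.35", or None when no GNU libc
    can be identified (e.g. musl-based distros without glibc compat).
    """
    # POSIX confstr is the cheapest probe on glibc-based systems
    try:
        value = os.confstr("CS_GNU_LIBC_VERSION")  # e.g. "glibc 2.35"
        if value and value.startswith("glibc"):
            return value.split()[-1]
    except (ValueError, OSError, AttributeError):
        # Name unknown (macOS/musl) or confstr missing (Windows)
        pass
    # platform.libc_ver() inspects the running executable as a fallback
    name, version = platform.libc_ver()
    if name == "glibc" and version:
        return version
    return None
```

On non-Linux platforms both probes fail gracefully and the helper returns `None`, matching the "virtual package absent" behaviour discussed later in the thread.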
---
Think MUSL should be its own thing. It is versioned differently. Plus users presumably want to distinguish GLIBC built packages from MUSL ones
---
With the stdlib infrastructure, we have the technical requirements in place to do something like:
c_stdlib: # [linux]
- sysroot # [linux]
- musl # [linux]
c_stdlib_version: # [linux]
- 2.17 # [linux]
- 1.2 # [linux]
zip_keys: # [linux]
  - - c_stdlib # [linux]
    - c_stdlib_version # [linux]
Whether we want to roll out a "conda-forge for musl" is of course a separate question.
For the purpose of this CEP, definitely `__glibc` needs to stay musl-free IMO. For future-proofing, we could add `__musl`, but no strong feelings.
---
This is the GLIBC version in MUSL distros, not the MUSL version itself. Happy to adjust the wording though. See this block in the constructor SH templates for some more context.
---
Having glibc in musl is certainly not the default, I actually didn't know this is supported - TIL. In that case I agree it makes sense to report the glibc that's found.
So my point was that we could also build musl-native libs, but I guess that's much less urgent if alpine-users can just install a glibc compat (assuming this works with our glibc-flavoured packages).
---
Reading the original comment (conda/constructor#850 (comment)), think that user is referring to gcompat:

> ...a library which provides glibc-compatible APIs for use on musl libc systems.
While that can be useful, think we should not encode that in this CEP.
It may make sense to have a CEP extending this for MUSL (perhaps including `gcompat`) and this could be one detail we include there, but would suggest we leave it out of this version and save it as future work.
---
> I think that user is referring to gcompat:

I'm not sure the original report was about `gcompat`, but about this alpine-focused `glibc` version.
---
The challenge is if we include this line (even as an example) future readers will look at this line and say "Hey that is how they said we should be supporting MUSL." Even if that is not what we meant. So taking this line out is safer than keeping it IMO.
The rest of the examples above seem fine
---
> @jakirkham: think that user is referring to gcompat:
Look at the alpine wiki link I posted:

> [wiki] If you want to run glibc programs in Alpine Linux, there are a few ways of doing so.

gcompat is one of them, but there are others.
> @jakirkham: future readers will look at this line and say "Hey that is how they said we should be supporting MUSL."
I don't think that's realistic; it's also easy to preempt - just say "the version of glibc, if present (musl-based distros like alpine only provide this for compatibility; their main C library is musl itself)"
---
Sorry Axel think we had posted around the same time. Agree there can be other ways people try to handle this need
To loop back to Jaime's original point on how to word this, what if we just say the following
Suggested change:

- System's `libc.so` (in MUSL distros, location not standardized)
+ System's GNU `libc.so` (see specific distro for location)

Note: Some distros put `libc.so` in different places. So this is worth stating in its own right.
---
I reworded these paragraphs a bit including the intent of the suggestions, let me know what you think!
Co-authored-by: jakirkham <jakirkham@gmail.com>
---
Thanks Jaime! 🙏
Like the improvements and additions
Took another review pass with comments below
cep-????.md
Outdated
- Generic: `x86_64_v1`, `x86_64_v2`, `x86_64_v3`, `x86_64_v4`, `arm`, `ppc`...
- Apple: For M1, the value is `m1`. M2 and M3 are reported as `m2` and `m3`, respectively.
---
I'm not sure I understand this part. `x86_64_v1`, `m1`, etc. are also part of archspec and are properly detected as such.
---
I reworded this section a bit. Does that help? Otherwise I'll need more details to understand the question. Thanks!
This virtual package MUST be present when the system exhibits GPU drivers compatible with the CUDA runtimes. When available, the version value MUST be set to the oldest CUDA version supported by the detected drivers (i.e. the formatted value of `libcuda.cuDriverGetVersion()`), constrained to the first two components (major and minor) and formatted as `{major}.{minor}`. The build string MUST be `0`.
---
`cuDriverGetVersion` returns the version of the installed driver, not the oldest CUDA version supported by the detected drivers. Due to backwards compatibility, newer drivers also support older versions.
---
I think this returns the CUDA version as seen by the driver. The docs are not super clear. There's also `cudaRuntimeGetVersion`. At least this is what conda/conda uses.
@jakirkham could you help clarify this? Thanks!
---
But I think the driver CUDA version might not be the "oldest CUDA version supported by the detected driver". E.g. If my driver is 12.6 it might also support older CUDA versions.
---
Asking offline to see if others have thoughts here
That said, here are some rough thoughts. Starting with some background
When the CUDA Toolkit is typically shipped (like in the installer), it ships with a full suite of things including the driver library. However in Conda, we ship pretty much everything else except the driver library. So with Conda, the user is on the hook for ensuring the driver library is installed, which would happen when they install the driver
So this raises a question. How does Conda decide what version of the driver is compatible with a particular CUDA Toolkit?
This is what `__cuda` is solving. It is giving us the driver library's associated CUDA Toolkit version. IOW if that driver library was shipped in an installer, this is that installer's CUDA Toolkit version.
So now when we try to install the CUDA Toolkit libraries, we can answer the question, will the driver be able to handle these?
Originally the answer was the CUDA Toolkit libraries we install could be no newer than the CUDA Toolkit the driver came from. We had checked this down to the major minor version (patch version was allowed to float). So if this driver check reported CUDA 10.1, we could only use CUDA Toolkit libraries from 10.1 or older. Eventually we relaxed this to major only for 11.2+ and 12
So really `__cuda`'s version is the upper bound on the libraries we can use.
Sorry for the long winded answer. Though hopefully the additional details provide context. Perhaps this can be incorporated into the text above somehow (after some discussion and Q&A)
---
Maybe something like this captures the idea above. Though please feel free to adjust further. Happy to look again after
Suggested change:

- This virtual package MUST be present when the system exhibits GPU drivers compatible with the CUDA runtimes. When available, the version value MUST be set to the oldest CUDA version supported by the detected drivers (i.e. the formatted value of `libcuda.cuDriverGetVersion()`), constrained to the first two components (major and minor) and formatted as `{major}.{minor}`. The build string MUST be `0`.
+ This virtual package MUST be present when the system exhibits NVIDIA GPU drivers compatible with the CUDA runtimes. For systems without such support the virtual package MUST NOT be present. When available, the version value MUST be set to the latest CUDA version supported by the detected drivers (i.e. the formatted value of [`cuDriverGetVersion()`](https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__VERSION.html)), constrained to the first two components (major and minor) and formatted as `{major}.{minor}`. The build string MUST be `0`.
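For reference, `cuDriverGetVersion()` reports a single integer encoding `1000 * major + 10 * minor`, so the `{major}.{minor}` formatting is a small arithmetic step. A sketch (the ctypes loading path is an illustrative assumption, not prescribed by the CEP):

```python
import ctypes


def format_cuda_version(encoded: int) -> str:
    # cuDriverGetVersion() encodes CUDA 12.6 as 12060:
    # 1000 * major + 10 * minor
    major, minor = encoded // 1000, (encoded % 1000) // 10
    return f"{major}.{minor}"


def detect_cuda():
    """Return the __cuda version string, or None without a usable driver."""
    try:
        # Assumption: Linux soname; Windows would use nvcuda.dll
        libcuda = ctypes.CDLL("libcuda.so.1")
    except OSError:
        return None
    version = ctypes.c_int(0)
    # CUresult 0 is CUDA_SUCCESS
    if libcuda.cuDriverGetVersion(ctypes.byref(version)) != 0:
        return None
    return format_cuda_version(version.value)
```

For example, `format_cuda_version(12060)` yields `"12.6"`; on a machine without the driver library, `detect_cuda()` returns `None` and the virtual package would be omitted.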
cep-????.md
Outdated
The version MUST be overridable with the `CONDA_OVERRIDE_GLIBC` environment variable, if set to a non-empty value.

If the GNU `libc` version could not be estimated (e.g. the tool is not running on Linux), the tool SHOULD provide a default value (e.g. `2.17`) and inform the user of that choice and its possible overrides; e.g. via `CONDA_OVERRIDE_GLIBC`, a CLI flag or a configuration file. The environment variable MUST be ignored when the target platform is not `linux-*`.
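The override rule quoted above could be implemented along these lines (a sketch; the helper name and signature are assumptions):

```python
import os


def effective_glibc(detected, target_platform):
    """Apply CONDA_OVERRIDE_GLIBC per the draft text: a non-empty value
    wins, but only when the target platform is linux-*."""
    if not target_platform.startswith("linux-"):
        # __glibc is a Linux-only virtual package; the env var is ignored
        return None
    override = os.environ.get("CONDA_OVERRIDE_GLIBC", "")
    if override:
        return override
    return detected
```

So with `CONDA_OVERRIDE_GLIBC=2.28`, `effective_glibc("2.17", "linux-64")` returns `"2.28"`, while the same call with `target_platform="osx-arm64"` returns `None`.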
---
Should there be a difference between GNU libc not being present and the version not being parsable? I think conda should also be able to run on systems without glibc if packages don't require it.
---
Maybe. What would be the expected behaviour? Something like this?

- Not found: `__glibc` does not show up.
- Not parsable: `__glibc=0` + warning/advice to override.
---
I reworded this section a bit now so let's take another look.
cep-????.md
Outdated
#### `__win`

This virtual package MUST be present when the target platform is `win-*`. The version MUST be set to the first three numeric components of the Windows build version, formatted as `{major}.{minor}.{build}`. If the version cannot be estimated (e.g. because the target platform does not match the native platform), the fallback value MUST be set to `0`. The build string MUST be `0`.
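The "first three numeric components, else `0`" rule above can be sketched as follows (hypothetical helper; on Windows the raw string could come from `platform.version()`, e.g. `"10.0.26100"`):

```python
def win_virtual_version(raw):
    """Format __win's version: first three numeric dot-separated
    components of the Windows build version, falling back to "0"."""
    if not raw:
        return "0"
    parts = []
    for piece in raw.split("."):
        if not piece.isdigit():
            break
        parts.append(piece)
        if len(parts) == 3:
            break
    return ".".join(parts) if len(parts) == 3 else "0"
```

For example, `win_virtual_version("10.0.26100")` yields `"10.0.26100"`, matching the `__win=10.0.26100=0` output mamba reports below.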
---
This is not currently the case for any installer (conda, mamba, pixi), right? I think all of them report `__win=0=0` at the moment.
I'm all for this change though!
Edit: mamba does seem to report the windows version properly:
> pixi exec mamba info
...
virtual packages : __win=10.0.26100=0
> pixi exec conda info
...
virtual packages : __win=0=0
> pixi info
...
Virtual packages: __win=0=0
---
Exactly! This is what triggered the CEP: we will need to extend what mamba does to the full ecosystem soon due to some Windows deprecations in the boost library. See referenced issues in the text for more details.
Co-authored-by: jakirkham <jakirkham@gmail.com>
I reworked some sections:
cep-????.md
Outdated
The build string MUST reflect one of:

- The detected CPU microarchitecture when the target platform matches the native platform, as specified in the [`archspec/archspec-json` database](https://github.com/archspec/archspec-json/blob/v0.2.5/cpu/microarchitectures.json). For Apple Silicon, the reported values MUST correspond to the `vendor == Apple` entries; e.g. `m1`, `m2`, etc. For any other vendors, the reported values MUST correspond to the `vendor == generic` entries; e.g. `x86_64_v1`, `x86_64_v2`, `x86_64_v3`, `x86_64_v4`, `arm`, `ppc`, etc.
---
Why should it correspond to `vendor == generic`? It's not the current behaviour in conda nor is it desired.
---
It's so far what I had observed in my Windows cloud machine, but maybe that's because `archspec` could not identify the actual CPU?
Reading the used function, that might indeed be the case.
---
Can we just say "whatever `archspec.cpu.detect.host()` does"?
---
Let me know if you find the latest edits acceptable 🙏
> The macOS version can be obtained via:
>
> - Python's `platform.mac_ver()[0]`
---
This will return the SYSTEM_VERSION_COMPAT version if the Python interpreter running the command was built against the 10.15 SDK or earlier.
See https://eclecticlight.co/2020/08/13/macos-version-numbering-isnt-so-simple/ for details.
---
AFAICR, if you start the Python interpreter with SYSTEM_VERSION_COMPAT=0 it returns the 11.x based version, right?
---
Yes, setting the environment variable will return a >=11 version.
My main concern here is the "MUST NOT" language around the SYSTEM_VERSION_COMPAT workaround. I agree with the idea but this is not the case for the current version of conda (see conda/conda#13832). The example in this issue still reports a 10.16 version for `__osx`. Changing this to "SHOULD" would be reasonable, especially given that many existing releases of tools/packages already report the compatibility version.
---
I think we could consider that a bug in conda and submit a fix for 25.1 (as we are doing for __win). I can survey public repodata for __osx usage in the wild if that helps inform this decision.
---
- `__unix`
- `__win`

#### `__archspec`
---
The microarch-level packages in conda-forge depend on `__archspec` to provide microarchitecture-level (e.g. x86-64-v2) meta-packages.
#### `__unix`

This virtual package MUST be present when the target platform is `linux-*`, `osx-*` or `freebsd-*`. The version and build string fields MUST be set to `0`.
---
Rather than list the specific target platforms now and having to adjust this CEP if/when new platforms are adopted, I would suggest language like "when the target platform is sufficiently POSIX-y" and list the attributes necessary for that to be true (e.g., uses `/` for path delimiters, supports `fork(3)`, etc.)
---
I agree with the sentiment, but I'm going to need some help or references to compile that list 😬
#### `__linux`

This virtual package MUST be present when the target platform is `linux-*`. Its version value MUST be set to the Linux kernel version, constrained to the first two to four numeric components formatted as `{major}.{minor}.{micro}.{patch}`. If the version cannot be estimated (e.g. because the native platform is not Linux), the tool MUST set `version` to a fallback value of its choice. The build string MUST be `0`.
---
The Linux kernel upstream only defines `{major}.{minor}.{micro}`; anything beyond that (including the `{patch}` component you wrote) is part of the distribution kernel's version string. I don't think conda tooling should expose those components since their semantics will vary from distribution to distribution; e.g., patch 42 on Fedora may differ from patch 42 on Ubuntu.
---
Hm, this contradicts this comment in conda/conda. I'm happy to trim to three fields, but we'll need a source.
---
Ah ha! Since Linux 3.0, "[mainline] kernels with 3-part versions are -stable kernels"; however, the mainline 2.6 series had 4-part version numbers. The existing "3- or 4-part" comment in conda/conda comes from the fact that 2.6 is the upstream for the CentOS/RHEL 6 kernel, which was actively supported by Anaconda and conda-forge at the time I wrote that patch.
In light of that, I suggest replacing:

> Its version value MUST be set to the Linux kernel version, constrained to the first two to four numeric components formatted as `{major}.{minor}.{micro}.{patch}`.

with something like:

> Its version value MUST be set to the upstream (AKA mainline) Linux kernel version, but it MUST exclude any and all distribution-specific components of the kernel version.
I think that captures the intent of my previous comment, which was to ensure:

- we don't allow conda packages to depend on [Linux] distribution-specific kernels via the `__linux` virtual package; and
- we don't have to deal with the various ways Linux distros version their kernels (which may or may not be compatible with how conda does version ordering)
(That said, my suggested language does allow for non-release/development/`-rcN` kernels, but I suspect that user base is relatively small and able to cope if something goes horribly wrong.)
This virtual package MUST be present when the native and target platforms are both the same type of `linux-*` and GNU `libc` is installed in the system. The version value MUST be set to the system GNU `libc` version, constrained to the first two components (major and minor) formatted as `{major}.{minor}`. If the version cannot be estimated, the tool MUST set the version to a default value (e.g. `2.17`).

If the native platform does not match the target platform, the tool MAY export `__glibc` with its `version` field set to a default value (e.g. `2.17`) of its choice.
---
Should we provide some guidance here on how the default value should be selected? If we don't, I could see situations arising where two different conda install tools or two different versions of the same install tool provide different default values on the same system.
---
- conda/conda sets it to `2.5` (yes, I know).
- Pixi chose `2.28`.
- Mamba skips it.
---
Rattler also skips the virtual package if a libc implementation cannot be found.
#### `__glibc`

This virtual package MUST NOT be present if the target platform is not `linux-*`.
---
"...if the target platform's C standard library is not the GNU C Library". Not sure if we codify the assumption that all Linux systems inherently provide GNU libc or that only Linux uses GNU libc.
---
Current conda implementation assumes the latter "only Linux uses GNU libc". See https://github.com/conda/conda/blob/e74a2b9d8a74837afc3bcbef609fe4bd29572e16/conda/plugins/virtual_packages/linux.py#L14-L16.
If that's incorrect, (1) we need to fix that in conda/conda, and (2) I will update this paragraph.
- `__unix`
- `__win`

#### `__archspec`
---
Are we aware of any existing packages that actually depend on `__archspec`?
If not, I would rather we not make this virtual package mandatory at this time, mostly because "what microarchitecture is this CPU?" is a question that can get complicated quickly; see, e.g., ARM big.LITTLE, Intel P/E cores, Intel Xeon 6, etc. To me, this virtual package is still (pseudo-)experimental, in the sense that we still need to work out how package maintainers should/want to use this package. IMO, `__archspec` should be its own CEP or set of CEPs (cf. #59).
---
#### `__archspec`

This virtual package MUST be always present, with the version set to `1`.
---
I'm wondering why this must always be present. For some target platforms it doesn't seem to make sense (any platform where the archspec would be reported as `0`, basically). Shouldn't we instead just omit the archspec in that case?
- If the target platform matches the native platform, the best fitting CPU microarchitecture in the [`archspec/archspec-json` database](https://github.com/archspec/archspec-json/blob/v0.2.5/cpu/microarchitectures.json). The reference CPU detection implementation is [`archspec.cpu.detect.host()`](https://github.com/archspec/archspec/blob/v0.2.5/archspec/cpu/detect.py#L338).

- The target platform architecture (second component of the platform string), mapped as:
---
An interesting observation from this table is that there are a number of entries here that are not present in the archspec database (armv6l, armv7l, s390x, arm64) and are thus not actually existing microarchitectures. I guess we can make something up. But for `arm64` I think it would be more appropriate to use `aarch64` instead.
---
For rattler we have the following behavior:

- `win-arm64` -> `aarch64`
- `osx-arm64` -> `m1`

I think this makes sense because these are actual existing microarchitectures and they are also both the lowest possible supported microarchitectures on these platforms.
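A sketch of that fallback mapping (a hypothetical, non-normative table combining a subset of the CEP draft's entries with the rattler behaviour described above):

```python
# Fallback microarchitecture per target platform when the native CPU
# cannot be inspected (subset; entries are illustrative assumptions)
FALLBACK_MICROARCH = {
    "linux-aarch64": "aarch64",
    "win-arm64": "aarch64",   # rattler behaviour discussed above
    "osx-arm64": "m1",        # lowest supported Apple Silicon microarch
    "linux-riscv64": "riscv64",
    "linux-s390x": "s390x",
}


def archspec_build_string(target_platform):
    # "0" for zos-z and any other unmapped value, per the draft table
    return FALLBACK_MICROARCH.get(target_platform, "0")
```

For example, `archspec_build_string("osx-arm64")` yields `"m1"` while `archspec_build_string("zos-z")` falls through to `"0"`.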
| `*-riscv64` | `riscv64` |
| `*-s390x` | `s390x` |
| `zos-z` | `0` |
| Any other value | `0` |
---
Should `riscv32` also be added?
---
Should we also add `wasm32`?
Motivated by this request: conda/conda#14443.