Windows 2022 MSYS2 - add base-devel & compression groups #4245

Closed
wants to merge 1 commit into from

MSP-Greg
Contributor

@MSP-Greg MSP-Greg commented Oct 9, 2021

Description

Windows 2022 image removed MSYS2 mingw build tools, and also removed 'shared' MSYS2 tools. Adds base-devel & compression groups to the MSYS2 Install. Previously, no testing was done on these packages, which do add several exe files.

Check list

  • Related issue / work item is attached
  • Tests are written (if applicable)
  • Documentation is updated (if applicable)
  • Changes are tested and related VM images are successfully generated

@MSP-Greg
Contributor Author

MSP-Greg commented Oct 9, 2021

Please consider installing the base-devel and compression groups to the MSYS2 installation on Windows-2022. Both were installed on previous images.

The base-devel group is shared by the separate build tool chains. For example, mingw or ucrt gcc won't run unless MSYS2 base-devel is installed. It contains many build tools that are platform/compiler agnostic. Without compression, there's no tar. Pretty common command for CI. I haven't checked for a while, but at one time the Windows tar did not support all the formats that MSYS2/*nix tar does.

I'm involved with Ruby, and with the custom actions used by most Ruby repos for CI. Windows Ruby has been built using MSYS2 for several years (it previously used MSYS). A few Ruby repos run bash scripts, and need a bash shell somewhere, but the majority do not need a bash shell for CI testing; they need it for build utilities. Note that many Ruby repos may not require build tools for their specific code, but they may require build tools for their direct dependencies or their CI testing dependencies.

With Windows-2019, gcc tools are available for mingw64 builds, and many repos use what's installed, since typically they are no more than 3 weeks old due to the update frequency of Actions images. Other repos (typically the actual Ruby repos) update the build tools and required packages (openssl, etc) so they are immediately aware of issues with new package releases.

Now today. Ruby is now being built with both mingw64 and ucrt64 tools. It's suggested that caching MSYS2 files be used, but we've found it much more efficient to have a separate cron repo running that packages all the mingw64 and ucrt64 tools and packages. This means it's stored in one repo (not every repo as with caching), and since the files are independent of the base MSYS2 install, there's no conflict with installing from a prebuilt zip file. It takes about ten seconds to install.

But, all of the build tool chains require all or part of the base-devel group to be installed. We have tried to create a prebuilt zip file, and it takes about 20 seconds to decompress. Also, some of its packages have dependencies that are shared by the existing MSYS install, so there could be issues with updates to those dependencies.

Summing up, without the base-devel and compression groups, many Ruby CI jobs (many only take a minute or two) will be slowed considerably and may be at risk of failure due to the above. I suspect that the same will be true of some of the other repos using MSYS2 build tools.

Off-topic

Obviously, most commercial Ruby applications run under *nix. But, many people learn coding on Windows, and in many parts of the world, everyone isn't running Win10 or Win11. Even then, jumping to WSL2/Ubuntu (or native *nix) is a big jump. Hence, I support Windows Ruby for the new users. I run Windows (and have done so since DOS days), but most of my coding is using WSL2/Ubuntu.

@miketimofeev
Contributor

@eine how will this affect the https://github.com/msys2/setup-msys2 action?

@eine

eine commented Oct 13, 2021

@miketimofeev, this affects users who want to use the MSYS2 installation provided by GitHub and have up-to-date packages, because they will need to uninstall these and reinstall them. However, precisely because this problem exists, the default in setup-msys2 is to ignore the existing installation and provide a clean one. We are willing to change that for windows-2022, unless packages are reintroduced, as proposed in this PR.

In fact, this has been discussed so many times (https://github.com/actions/virtual-environments/pulls?q=is%3Apr+commenter%3Aeine+is%3Aclosed+sort%3Aupdated-desc and https://github.com/actions/virtual-environments/issues?q=is%3Aissue+commenter%3Aeine+is%3Aclosed), the last one being: #3652. However, @MSP-Greg does not agree with the current criteria, so every once in a while PRs such as this are proposed, the scope being the set of minimal packages required for building Ruby. The fact that he did not provide any reference to previous discussion, nor did he ping anyone, speaks for itself. Furthermore, he did not mention that the tools were removed on purpose from 2022 by maintainers of this repository (it was not an oversight). See #3949:

On Windows Server 2016 and 2019 we had a lot of pre-installed MSYS2 packages. We had a long discussion about "pre-installing them" vs "installing in runtime and caching". Our recommendation is using official setup-msys2 action to install packages on-flight and cache them.


It takes 20-60 seconds to have base-devel compression installed in a default environment which is not updated. Having those installed by default would avoid that minute for the users who are using MSYS2 for compiling something on a non up to date environment. At the same time, it would add a several minute penalty (up to 10 min) to all other users (anyone using an up to date environment, anyone using MSYS2 for executing tools but not for compiling anything, etc.). Taking the bigger scope into account, I do not recommend installing these packages. However, it is up to GitHub to prioritise certain developer flows over others.

Last, but not least, the best solution would be for GitHub to mirror MSYS2's package repo close to the location of the virtual environments, and have them updated in sync with regenerating the virtual environments. That would guarantee that users can rely on not updating the built-in MSYS2 installation, and still be able to install additional packages cleanly. In that context, the reason for setup-msys2 to handle release: false and release: true would be stronger. false would imply using the GitHub installation and GitHub mirrors; while true would imply installing latest from MSYS2 and using latest mirrors.

@MSP-Greg
Contributor Author

I'm not interested in discussing these issues again with @eine.

It takes 20-60 seconds to have base-devel compression

It just took 117.37 s to install them on Windows-2022

At the same time, it would add a several minute penalty (up to 10 min)

It just took 21.56 s to update them on Windows-2019

There are some people that believe you shouldn't be outside during a thunderstorm, as you might get struck by lightning. I've been working with MSYS2 for five or six years, been building Windows Ruby for almost as long (started with AppVeyor), helped with many Ruby gems building/compiling with OpenSSL, database packages, etc.

I didn't even mention whether Windows MSYS2 should be similar to Ubuntu or macOS shells, and without base-devel and compression they are not.

Building & testing Ruby itself takes a while on Windows, upwards of an hour when CI is busy. Many Ruby gems require compiling for the gem or its dependencies, but the CI build & test may take only a minute or two. Hence, if it takes two minutes to install the base script tools for building, it will increase the time a great deal. As mentioned above, we can 'pre-compile/compress' the toolchains (mingw64, ucrt64, etc) and install them in 10 or 20 seconds. We cannot do that with base-devel and compression.

base-devel and compression are relatively stable compared to many other packages/groups. The only updates needed were bison and p7zip. Bison is actually recently active, but most of the updates are bug fixes. p7zip had one update in January, the update before that was Apr-2017.

Lastly, how many Ubuntu CI jobs run apt update?

Wondering if @lazka has an opinion on this. Sorry for the ping, but...

@eine

eine commented Oct 14, 2021

I'm not interested in discussing these issues again with @eine.

I agree. Please, refrain from bringing the same arguments again and again. I'm not arguing with you, but being forced to provide references because you avoid it, seemingly attempting to create confusion.

It just took 117.37 s to install them on Windows-2022

This was thoroughly discussed and proven already. See #3652 (comment). See also #3652 (comment), since we analysed it in both absolute and relative terms.

It just took 21.56 s to update them on Windows-2019

I'm pretty sure those 22s do not correspond to the use case I was talking about. Please, feel free to provide a reproducer that proves your point. I refer readers to #3652 (comment) for reproducible tests.

There are some people that believe you shouldn't be outside during a thunderstorm, as you might get struck by lightning. I've been working with MSYS2 for five or six years, been building Windows Ruby for almost as long (started with AppVeyor), helped with many Ruby gems building/compiling with OpenSSL, database packages, etc.

As mentioned several times before, this is not a fallacy show. None of those points are pertinent from a technical point of view.
Our background is irrelevant for proving a technical point.

I didn't even mention whether Windows MSYS2 should be similar to Ubuntu or macOS shells, and without base-devel and compression they are not.

I'm glad you did not, because that is off-topic.

Building & testing Ruby itself takes a while on Windows, upwards of an hour when CI is busy. Many Ruby gems require compiling for the gem or its dependencies, but the CI build & test may take only a minute or two. Hence, if it takes two minutes to install the base script tools for building, it will increase the time a great deal. As mentioned above, we can 'pre-compile/compress' the toolchains (mingw64, ucrt64, etc) and install them in 10 or 20 seconds. We cannot do that with base-devel and compression.

Precisely, all this comes down to you considering "a great deal" to have an at most 1 minute penalty in your own workflows, but you seem to be ok with thousands of other users having an up to 10 minute penalty in theirs (just for the reason of satisfying your limited scope demands). I agree it might feel like a great deal to duplicate the runtime of a 1 minute job. However, I'm convinced it is a much greater deal for a majority of users to pay an up to 10x penalty. Some minimal penalty is unavoidable because MSYS2 is a rolling release. Maybe you should try a non-rolling solution that better matches your needs. See #3652 (comment).
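
Both sides of this exchange argue in relative terms: how much a fixed setup cost stretches a short job. A minimal sketch of that arithmetic follows, using only the figures quoted in this thread (they are not re-measured here). The helper name `relative_overhead` and the 60 s and 120 s job lengths are assumptions chosen to match the "minute or two" jobs described above.

```python
def relative_overhead(job_seconds: float, setup_seconds: float) -> float:
    """Return total runtime as a multiple of the job's own runtime."""
    if job_seconds <= 0:
        raise ValueError("job_seconds must be positive")
    return (job_seconds + setup_seconds) / job_seconds


# Figures quoted in the thread; treat them as claims, not measurements.
INSTALL_2022_S = 117.37      # installing base-devel + compression on Windows-2022
UPDATE_2019_S = 21.56        # updating the preinstalled groups on Windows-2019
WORST_CASE_UPDATE_S = 600.0  # the "up to 10 min" update penalty

if __name__ == "__main__":
    for job in (60.0, 120.0):  # assumed job lengths: one and two minutes
        for label, setup in (
            ("install on 2022", INSTALL_2022_S),
            ("update on 2019", UPDATE_2019_S),
            ("worst-case update", WORST_CASE_UPDATE_S),
        ):
            factor = relative_overhead(job, setup)
            print(f"{job:.0f}s job, {label}: {factor:.2f}x total runtime")
```

The multiplier depends entirely on which setup cost applies to a given workflow, and that is the point the two commenters dispute.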

Lastly, how many Ubuntu CI jobs run apt update?

According to GitHub's search feature, it might be around 300 000: https://github.com/search?l=YAML&q=apt-get+update&type=Code. Please, try to do your research before raising an argument.

@miketimofeev
Contributor

Given the fact that some users will experience a 10+ minutes penalty, we would rather avoid merging these changes.

@MSP-Greg
Contributor Author

As mentioned above, using windows-2022 with Ruby repos that require gcc tools, the additional time needed is between 45 and 60 seconds when CI is busy.

We're testing pre-assembling msys2, mingw64, and ucrt64 tools/files three times a day into 7z files, and extracting them on each job. They're stored in a repo release. That seems quickest, and removes the traffic from the MSYS2 servers, which would normally happen for each job.

@eine

According to GitHub's search feature, it might be around 300 000

Yes, you've previously pointed out your ability to use the GitHub search feature. Unfortunately, many of those results are for CI other than GitHub Actions. If one adds a constraint to the workflow folder, the number is ~ 54k. Not a high number.

in your own workflows

My workflows? A quick check of five or six popular Ruby repos/gems that require build tools, all show 'used by' numbers between 1M & 2.5M. Many of those 'used by' repos may not be running CI on Windows, but that's still a lot of repos...

@eine

eine commented Nov 15, 2021

@MSP-Greg

As mentioned above, using windows-2022 with Ruby repos that require gcc tools, the additional time needed is between 45 and 60 seconds when CI is busy.

Precisely, it is at least 10 times less than 10+ minutes.

We're testing pre-assembling msys2, mingw64, and ucrt64 tools/files three times a day into 7z files, and extracting them on each job. They're stored in a repo release. That seems quickest, and removes the traffic from the MSYS2 servers, which would normally happen for each job.

Should you succeed, please propose enhancements to msys2/setup-msys2 or msys2.org/docs/ci (through msys2/msys2.github.io).

As you may know, there is a built-in feature in setup-msys2 for caching the whole installation: https://github.com/msys2/setup-msys2/blob/d07afaf11ae7963aa30c77d8e1f45a1be5f19954/main.js#L15-L16

It is currently disabled due to reliability issues. As explained in msys2/setup-msys2#61, GitHub's cache storage might be buggy.
However, that issue was created more than a year ago. Maybe GitHub/Microsoft improved the infrastructure since then. Feel free to fork and test.

If one adds a constraint to the workflow folder, the number is ~ 54k. Not a high number.

Fair enough. Still, 54k is an order of magnitude below 300k and four orders of magnitude above "five or six".
I'm not the one bringing the quantitative arguments with regard to the user bases; but just reflecting non-solid arguments.
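
The ratios traded in this exchange can be checked with a short calculation. This is only a sketch: `orders_of_magnitude` is a hypothetical helper, and the inputs are the rounded counts quoted in the thread (300k, 54k, and "five or six").

```python
import math


def orders_of_magnitude(a: float, b: float) -> float:
    """Return log10(a / b): how many powers of ten separate a from b."""
    if a <= 0 or b <= 0:
        raise ValueError("both values must be positive")
    return math.log10(a / b)


if __name__ == "__main__":
    # Rounded counts quoted in the thread.
    print(f"300k vs 54k: {orders_of_magnitude(300_000, 54_000):.2f}")
    print(f"54k vs 6:    {orders_of_magnitude(54_000, 6):.2f}")
    print(f"54k vs 5:    {orders_of_magnitude(54_000, 5):.2f}")
```

A result near 1 means roughly a tenfold gap; a result near 4 means roughly ten-thousandfold.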

My workflows? A quick check of five or six popular Ruby repos/gems that require build tools, all show 'used by' numbers between 1M & 2.5M. Many of those 'used by' repos may not be running CI on Windows, but that's still a lot of repos...

As explained several times already, if you want to bring data into the discussion, please do it properly.
Apart from GitHub's search feature, you can use the API for getting lots of data.
That would be very beneficial to stop circling around.
There is a Ruby toolkit: octokit/octokit.rb.
Unfortunately, vague and scope-limited claims are not very useful.
Particularly, as I guess you know, the "Used by" feature is unrelated to the usage of CI. You might want to constrain it to the ones using CI and Windows and the default installation of MSYS2; since that is the scope of this PR.

@miketimofeev miketimofeev mentioned this pull request Feb 2, 2022
9 tasks