Testing with Build Candidates should fetch packages from staging instead of production #763

Closed
simitt opened this issue Nov 4, 2021 · 11 comments

Comments

@simitt
Contributor

simitt commented Nov 4, 2021

When testing Build Candidates for the upcoming release on cloud, packages are fetched from the production instead of the staging package registry, even when testing on cloud staging. This prevents testing BCs for packages that are version-aligned with stack releases. We are forced either to test with daily snapshots instead or to publish the upcoming package weeks ahead of the stack release, which prevents further bug fixes in the packages throughout Feature Freeze.

@ruflin
Collaborator

ruflin commented Nov 5, 2021

I think this applies not only to Cloud but also on prem. Because the build candidate does not carry a -SNAPSHOT flag or similar, it uses production. The difference on prem is that you can specify the registry URL in Kibana. Having said that, I think in some Cloud environments on staging you should be able to do this too. Would this help?

@simitt
Contributor Author

simitt commented Nov 5, 2021

To me this honestly looks like a bug. Why would a BC on cloud staging fetch the production package registry? Cloud production is probably a grey area, but I would still argue that all unreleased builds should be fetching from staging.

The package promotion process is clearly defined, and as long as developers follow it, there won't be a difference between staging and production packages.
Not sure this issue sits well in this repository though, or better with the Fleet team?

@ruflin
Collaborator

ruflin commented Nov 5, 2021

The logic that decides which registry is pulled can be found here: https://github.com/elastic/kibana/blob/ba367bca405487e8d1e6bc45a43ebac451d52e75/x-pack/plugins/fleet/server/services/epm/registry/registry_url.ts#L22

The challenge is that Kibana itself cannot tell whether it is a BC or a release. That BCs access production is also documented here: https://github.com/elastic/package-storage#package-storage

That does not mean we should not change it.

Would you also expect that BC outside Cloud would access staging?
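
The decision described in this comment can be sketched roughly as follows. This is an illustration only, not the code at the linked commit: the function names and URL constants are hypothetical, and which non-production registry a -SNAPSHOT build defaults to is an assumption. The optional `configuredUrl` parameter stands in for the registry URL that can be specified in Kibana.

```typescript
// Hypothetical sketch of the registry selection discussed in this thread.
// NOT Kibana's actual implementation; names and URLs are assumptions.
const PRODUCTION_REGISTRY_URL = 'https://epr.elastic.co';
// Assumption: snapshot builds default to the staging registry.
const STAGING_REGISTRY_URL = 'https://epr-staging.elastic.co';

function defaultRegistryUrl(kibanaVersion: string): string {
  // A BC carries the final version string (no -SNAPSHOT suffix), so it
  // cannot be told apart from a release and falls through to production.
  return kibanaVersion.includes('-SNAPSHOT')
    ? STAGING_REGISTRY_URL
    : PRODUCTION_REGISTRY_URL;
}

function resolveRegistryUrl(kibanaVersion: string, configuredUrl?: string): string {
  // An explicitly configured registry URL takes precedence over the default.
  return configuredUrl !== undefined
    ? configuredUrl
    : defaultRegistryUrl(kibanaVersion);
}
```

Under these assumptions, a BC with version `8.0.0` resolves to production unless a registry URL is configured explicitly, which is the behaviour this issue reports.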

@mtojek
Contributor

mtojek commented Nov 5, 2021

> Not sure this issue sits well in this repository though, or better with the Fleet team?

I guess it's more an area belonging to the @elastic/observablt-robots team?

> When testing Build Candidates for the upcoming release on cloud, packages are fetched from the production instead of the staging package registry, even when testing on cloud staging.

... and it was working this way for some time in our observability test clusters? I don't know if it still is, but it's definitely configurable.

@simitt
Contributor Author

simitt commented Nov 5, 2021

I personally would expect all BC to be fetched from staging, as they are not production released.

@mtojek this is also a concern on prem and on ESS, so it is out of scope for the automation team.

Adding @joshdover and @jen-huang to the conversation.

@jsoriano
Member

jsoriano commented Nov 5, 2021

Related discussion: elastic/package-spec#225

The current proposal there includes the option of releasing pre-release packages, marked with semver versions like x.x.x-rc1, in the production registry. These would be the relevant points:

  • Completely remove the release labels. To avoid breaking changes, they could be marked as optional in the spec and eventually be ignored everywhere; the package registry API could do a best-effort interpretation of old experimental-related queries.
  • Semver pre-release fields can be used to release unstable versions. Package stability (and supportability) would be determined only by the version, not by the package's origin. This means a package has to change in order to be released as stable, but that is fine, as it allows an approval step to be introduced into the process.

This requires changes:

  • In the UX to somehow handle pre-release packages.
  • In the registry, to "emulate" the experimental searches in old versions of Kibana.
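
As a rough illustration of "stability determined only by version", the check could look like the sketch below. The helper names `isPrerelease` and `stableVersions` are hypothetical and are not part of the package registry or the package spec; the regular expression is a simplified semver pattern written for this example.

```typescript
// Hypothetical helpers, for illustration only: a version counts as unstable
// when it carries a semver pre-release tag (the part after "-", e.g. "rc1").
const SEMVER = /^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?(?:\+[0-9A-Za-z.-]+)?$/;

function isPrerelease(version: string): boolean {
  const match = SEMVER.exec(version);
  if (match === null) {
    throw new Error(`not a semantic version: ${version}`);
  }
  // Capture group 4 holds the pre-release tag, if one is present.
  return match[4] !== undefined;
}

function stableVersions(versions: string[]): string[] {
  // Keep only versions without a pre-release tag.
  return versions.filter((v) => !isPrerelease(v));
}
```

With a check like this, a registry or UI could hide versions such as `1.2.0-rc1` by default and show them only when pre-releases are explicitly requested.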

@joshdover

> I personally would expect all BC to be fetched from staging, as they are not production released.

BCs can be promoted directly to release builds. There is no difference between the final BC and the release that is published to customers. It's the same artifact byte-for-byte. So I don't think we even can do this. Personally, I also don't think it would be a good idea since BCs are intended to be exactly what we plan to ship.

> To me this honestly looks like a bug. Why would a BC on cloud staging fetch the production package registry? Cloud production is probably a grey area, but I would still argue that all unreleased builds should be fetching from staging.

If desired, we could configure Kibana instances on Cloud to use the staging registry, as mentioned above.

@tobio
Member

tobio commented Feb 28, 2022

Is there a path forward for this issue? We seem to run into it at least once per Stack release in Cloud, and each time it takes a lot of time to investigate why APM has suddenly stopped working, only to find our way back to this issue.

@mtojek
Contributor

mtojek commented Feb 28, 2022

This is not a bug in the Package Registry, so I guess we can close it now. It's a Cloud configuration issue.

To select packages either from staging or production, simply follow Josh's advice (docs here, epr-staging endpoint):

> If desired, we could configure Kibana instances on Cloud to use the staging registry, as mentioned above.

@jsoriano
Member

I guess that bundling may also help here? elastic/kibana#122297

I agree with closing this issue, as I don't think there is much we can do in the registry for this.

@joshdover

joshdover commented Mar 7, 2022

> Is there a path forward for this issue? We seem to run into it at least once per Stack release in Cloud, and each time it takes a lot of time to investigate why APM has suddenly stopped working, only to find our way back to this issue.

@tobio Bundling should help with this starting with 8.2 BCs.


6 participants