
[Upgrade Details] Ensure details report UPG_WATCHING for the entire time that the upgrade is being watched #3827

Merged
merged 29 commits into elastic:main from upgrade-details-fix-upg-watching on Dec 19, 2023

Conversation

ycombinator
Contributor

@ycombinator ycombinator commented Nov 27, 2023

What does this PR do?

This PR fixes the upgrade details such that they are in the UPG_WATCHING state the entire time the Agent upgrade is being watched by the Upgrade Watcher.

Why is it important?

So that the state of the upgrade is accurately reported.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool (bug was never released)
  • I have added an integration test or an E2E test

How to test this PR locally

  1. Build Elastic Agent from this PR but give it a lower-than-current version number. This will be the starting (pre-upgrade) version of the Agent.
    AGENT_PACKAGE_VERSION=8.11.0 EXTERNAL=true SNAPSHOT=true PLATFORMS=darwin/arm64 PACKAGES=targz mage package
    
  2. Make a no-op commit. Without this, the Agent upgrade will not succeed.
  3. Build Elastic Agent again. This will be the target (post-upgrade) version of the Agent.
    EXTERNAL=true SNAPSHOT=true PLATFORMS=darwin/arm64 PACKAGES=targz mage package
    
  4. Install the starting version of the Agent.
  5. Upgrade to the target version of the Agent.
    sudo elastic-agent upgrade 8.12.0-SNAPSHOT --source-uri file:///Users/shaunak/development/github/elastic-agent/build/distributions --skip-verify
    
  6. Check that the Agent status reports upgrade details with a state of UPG_WATCHING.
    sudo elastic-agent status --output json | jq '.upgrade_details'
    {
      "target_version": "8.12.0",
      "state": "UPG_WATCHING",
      "metadata": {}
    }
    
  7. Check the Agent logs and verify the upgrade details states are in order. In particular, make sure that chronologically, we see UPG_WATCHING after UPG_RESTARTING.
    sudo grep -R -h --include=\*.ndjson UPG_ /Library/Elastic/Agent | jq -c -s 'sort_by(.["@timestamp"]) | .[]'
    {"log.level":"info","@timestamp":"2023-11-28T11:13:45.530Z","log.origin":{"file.name":"coordinator/coordinator.go","file.line":499},"message":"updated upgrade details","log":{"source":"elastic-agent"},"upgrade_details":{"target_version":"8.12.0-SNAPSHOT","state":"UPG_REQUESTED","metadata":{}},"ecs.version":"1.6.0"}
    {"log.level":"info","@timestamp":"2023-11-28T11:13:45.531Z","log.origin":{"file.name":"coordinator/coordinator.go","file.line":499},"message":"updated upgrade details","log":{"source":"elastic-agent"},"upgrade_details":{"target_version":"8.12.0-SNAPSHOT","state":"UPG_DOWNLOADING","metadata":{}},"ecs.version":"1.6.0"}
    {"log.level":"info","@timestamp":"2023-11-28T11:13:45.729Z","log.origin":{"file.name":"coordinator/coordinator.go","file.line":499},"message":"updated upgrade details","log":{"source":"elastic-agent"},"upgrade_details":{"target_version":"8.12.0-SNAPSHOT","state":"UPG_EXTRACTING","metadata":{}},"ecs.version":"1.6.0"}
    {"log.level":"info","@timestamp":"2023-11-28T11:13:52.308Z","log.origin":{"file.name":"coordinator/coordinator.go","file.line":499},"message":"updated upgrade details","log":{"source":"elastic-agent"},"upgrade_details":{"target_version":"8.12.0-SNAPSHOT","state":"UPG_REPLACING","metadata":{}},"ecs.version":"1.6.0"}
    {"log.level":"info","@timestamp":"2023-11-28T11:13:52.313Z","log.origin":{"file.name":"coordinator/coordinator.go","file.line":499},"message":"updated upgrade details","log":{"source":"elastic-agent"},"upgrade_details":{"target_version":"8.12.0-SNAPSHOT","state":"UPG_REPLACING","metadata":{}},"ecs.version":"1.6.0"}
    {"log.level":"info","@timestamp":"2023-11-28T11:13:52.320Z","log.origin":{"file.name":"coordinator/coordinator.go","file.line":499},"message":"updated upgrade details","log":{"source":"elastic-agent"},"upgrade_details":{"target_version":"8.12.0-SNAPSHOT","state":"UPG_RESTARTING","metadata":{}},"ecs.version":"1.6.0"}
    {"log.level":"info","@timestamp":"2023-11-28T11:14:00.374Z","log.origin":{"file.name":"coordinator/coordinator.go","file.line":499},"message":"updated upgrade details","log":{"source":"elastic-agent"},"upgrade_details":{"target_version":"8.12.0","state":"UPG_WATCHING","metadata":{}},"ecs.version":"1.6.0"}
    
  8. Wait until the Upgrade Watcher has finished running.
    pgrep -f 'elastic-agent watch' | wc -l    # should report 0 eventually
    
  9. Check the Agent version and verify that it's the target version.
    sudo elastic-agent version
    
    Binary: 8.12.0-SNAPSHOT (build: 80bb6a61369c20e054478a73c6c866aadfcc52b1 at 2023-11-27 23:38:37 +0000 UTC)
    Daemon: 8.12.0-SNAPSHOT (build: 80bb6a61369c20e054478a73c6c866aadfcc52b1 at 2023-11-27 23:38:37 +0000 UTC)
    
  10. Cleanup: revert/remove the no-op commit from step 2.

Related issues

@ycombinator ycombinator force-pushed the upgrade-details-fix-upg-watching branch from 0ee25d4 to b1f5e51 Compare November 28, 2023 11:23
@ycombinator ycombinator marked this pull request as ready for review November 28, 2023 16:02
@ycombinator ycombinator requested a review from a team as a code owner November 28, 2023 16:02
@elasticmachine
Contributor

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

Comment on lines 120 to 122
// - the marker was just created and the upgrade is about to start
// (marker.details.state should not be empty), or
// - the upgrade was rolled back (marker.details.state should be UPG_ROLLBACK)
Member

Is there a third option we haven't considered? The case where the agent is restarted in the middle of an upgrade, for example when it is in the UPG_EXTRACTING state. The obvious example would be the host system powering off.

What happens in this case? What is reported in the upgrade details?

Member

If we just rolled back, is the previous version of the agent guaranteed to be next to the currently running agent in the data path? That is, is the state of the filesystem always something like:

data/
  elastic-agent-current/
  elastic-agent-next/

If this is true, can we look to see if there's another agent in the data directory next to us to detect whether there was a rollback?
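For illustration, a minimal sketch of the check being suggested here (hypothetical helper; the directory-name prefix is an assumption based on the layout above, not the project's actual naming):

package sketch

import (
	"os"
	"strings"
)

// otherAgentDirPresent is a hypothetical helper: it reports whether more than
// one elastic-agent-* directory exists under dataDir, i.e. whether another
// agent version is sitting next to the currently running one.
func otherAgentDirPresent(dataDir string) (bool, error) {
	entries, err := os.ReadDir(dataDir)
	if err != nil {
		return false, err
	}
	agentDirs := 0
	for _, e := range entries {
		if e.IsDir() && strings.HasPrefix(e.Name(), "elastic-agent-") {
			agentDirs++
		}
	}
	return agentDirs > 1, nil
}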

Member

We will likely have both versions next to each other if the upgrade is interrupted immediately after the artifact is extracted, so that may not be 100% reliable either.

One more thing, in the case that the host system powers off we should start calling https://pkg.go.dev/os#File.Sync on the marker file. It doesn't look like we do this today.

// On non-Windows platforms, writeMarkerFile simply writes the marker file.
// See marker_access_windows.go for behavior on Windows platforms.
func writeMarkerFile(markerFile string, markerBytes []byte) error {
	return os.WriteFile(markerFilePath(), markerBytes, 0600)
}

This might only be worth doing around critical transitions, like when the agent process re-execs or the watcher first starts up.
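A minimal sketch of what a synced write might look like (hypothetical function name, not the project's code; os.WriteFile does not fsync, so the file is opened and synced explicitly):

package sketch

import "os"

// writeMarkerFileSynced is a sketch: it writes the marker bytes and flushes
// them to stable storage before returning, so an abrupt power loss is less
// likely to leave a missing or truncated marker file.
func writeMarkerFileSynced(markerFile string, markerBytes []byte) error {
	f, err := os.OpenFile(markerFile, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0600)
	if err != nil {
		return err
	}
	defer f.Close()
	if _, err := f.Write(markerBytes); err != nil {
		return err
	}
	// os.File.Sync wraps fsync(2); this is the call suggested above.
	return f.Sync()
}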

Contributor Author

@ycombinator ycombinator Nov 28, 2023

One more thing, in the case that the host system powers off we should start calling https://pkg.go.dev/os#File.Sync on the marker file...

I've implemented the fsync change in its own PR, since it's not strictly related to this PR here: #3836

Contributor Author

Is there a third option we haven't considered? The case where the agent is restarted in the middle of an upgrade, for example when it is in the UPG_EXTRACTING state. The obvious example would be the host system powering off.

What happens in this case? What is reported in the upgrade details?

It depends on whether the state transition happens before or after the upgrade marker file comes into existence.

The following states occur before the upgrade marker file comes into existence and, as such, are never persisted in it: UPG_REQUESTED, UPG_SCHEDULED, UPG_DOWNLOADING, UPG_EXTRACTING. Additionally, the UPG_RESTARTING state is also currently not being persisted to the upgrade marker file, mostly because of where this state transition happens in the code vs. where the upgrade marker file is being created. With some refactoring, we could start persisting this state to the upgrade marker file as well. So if the Agent were to restart during one of these states, the upgrade details that are stored in the Coordinator state (and from there sent to Fleet) would get reset to nothing and the upgrade state would be lost.

The following states occur right before or after the upgrade marker file comes into existence and, as such, do get persisted to it: UPG_REPLACING, UPG_WATCHING, UPG_ROLLBACK. So if the Agent were to restart during one of these states, the upgrade details from the upgrade marker would be restored to the Coordinator state (and from there sent to Fleet).

We may want to consider either persisting upgrade details in their own file throughout the upgrade process OR creating the upgrade marker file at the start of the upgrade process instead of where it's being created now (right before the Upgrade Watcher is invoked from the old Agent).
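As a rough sketch of the first idea (persisting upgrade details in their own file on every state change), with all names and types below being assumptions rather than the project's API:

package sketch

import (
	"encoding/json"
	"os"
)

// upgradeDetails is a stand-in for the real upgrade details structure.
type upgradeDetails struct {
	TargetVersion string            `json:"target_version"`
	State         string            `json:"state"`
	Metadata      map[string]string `json:"metadata"`
}

// persistDetails would be called on every state transition, so states that are
// never written to the upgrade marker today (UPG_REQUESTED, UPG_DOWNLOADING,
// UPG_EXTRACTING, UPG_RESTARTING, ...) could still be recovered after a restart.
func persistDetails(path string, d upgradeDetails) error {
	b, err := json.Marshal(d)
	if err != nil {
		return err
	}
	return os.WriteFile(path, b, 0600)
}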

Contributor Author

If we just rolled back, is the previous version of the agent guaranteed to be next to the currently running agent in the data path? That is, is the state of the filesystem always something like:

data/
  elastic-agent-current/
  elastic-agent-next/

If this is true, can we look to see if there's another agent in the data directory next to us to detect whether there was a rollback?

Two folders will exist the moment we go past the UPG_EXTRACTING state and, yes, two folders will also exist when we are in the UPG_ROLLBACK state. So I'm not sure checking whether the number of folders is greater than 1 is sufficient to determine if we're about to upgrade or if we've just rolled back.

But I think there might be another solution to detecting if we're about to upgrade: the code that creates the Upgrade Marker file runs in the same process as the code that's watching the Upgrade Marker file for changes. As such, the former code can communicate to the latter in memory that we're about to upgrade. In the case of a rollback, this communication will not happen.

Let me explore this solution in a separate PR as it's not strictly related to this PR here.
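A toy sketch of that in-memory signal (all names hypothetical): the upgrade code announces, in process, that a marker is about to be written, so a marker that later appears without an announcement can be treated as a rollback.

package sketch

// markerExpectation is a hypothetical in-memory channel between the code that
// writes the upgrade marker and the code that watches it for changes.
type markerExpectation struct {
	ch chan struct{}
}

func newMarkerExpectation() *markerExpectation {
	return &markerExpectation{ch: make(chan struct{}, 1)}
}

// UpgradeStarting is called by the upgrader just before it writes the marker.
func (m *markerExpectation) UpgradeStarting() {
	select {
	case m.ch <- struct{}{}:
	default: // already announced
	}
}

// MarkerAppeared is called by the watcher when it sees a new marker file. It
// returns true if the upgrader announced it, false if the marker showed up
// unannounced (for example, rewritten during a rollback).
func (m *markerExpectation) MarkerAppeared() bool {
	select {
	case <-m.ch:
		return true
	default:
		return false
	}
}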

Contributor Author

But I think there might be another solution to detecting if we're about to upgrade: the code that creates the Upgrade Marker file runs in the same process as the code that's watching the Upgrade Marker file for changes. As such, the former code can communicate to the latter in memory that we're about to upgrade. In the case of a rollback, this communication will not happen.

Let me explore this solution in a separate PR as it's not strictly related to this PR here.

#3837

Member

The following states occur right before or after the upgrade marker file comes into existence and, as such, do get persisted to it: UPG_REPLACING, UPG_WATCHING, UPG_ROLLBACK. So if the Agent were to restart during one of these states, the upgrade details from the upgrade marker would be restored to the Coordinator state (and from there sent to Fleet).

What happens to the upgrade action if an upgrade is interrupted after it is started but before it is completed? Does the agent start over from the beginning? Does it acknowledge it as if the upgrade had happened even though it didn't? Does it never get acknowledged?

What does the watcher do if it starts up, sees an upgrade marker, but the version of the agent that is currently running isn't the version that should be running?

It would be surprising for a user to see an upgrade stuck in UPG_REPLACING, for example. If the upgrade is essentially aborted by a host reboot, then UPG_ROLLBACK could be considered the correct state. The UPG_WATCHING state always clears itself when the watcher stops, but I'm not sure what happens to this state in this situation today.

I think the ideal thing to happen in this situation is that the upgrade restarts from the beginning when the agent host system comes back online.

Contributor Author

What happens to the upgrade action if an upgrade is interrupted after it is started but before it is completed? Does the agent start over from the beginning? Does it acknowledge it as if the upgrade had happened even though it didn't? Does it never get acknowledged?

Looking at the code, once the Upgrade Marker has been created, if the Agent restarts, it will acknowledge the upgrade with Fleet even if the upgrade may not have completed. In fact, it's entirely possible that Agent acknowledges the upgrade with Fleet while the Upgrade Watcher is still running and then the Upgrade Watcher decides to roll back the Agent. As far as Fleet is concerned, the upgrade would've been reported as successful, and then within 10 minutes, the previous version of Agent would start showing again without any explanation as to why.

With Upgrade Details, in the above rollback scenario, the Upgrade Watcher will record the state as UPG_ROLLBACK in the Upgrade Marker file, which will get picked up by the main Agent process and sent to Fleet.

What does the watcher do if it starts up, sees an upgrade marker, but the version of the agent that is currently running isn't the version that should be running?

Again, looking at the code...

First, if the Upgrade Watcher was running before an interruption stopped/killed it, and then the Upgrade Watcher was restarted, it will exit immediately because the watcher lock file, watcher.lock, will still exist. In this case, there are two possibilities as to which state will be reported in Upgrade Details to Fleet:

  • if the Upgrade Marker contains Upgrade Details, the state recorded in it will be reported to Fleet. This should be the UPG_REPLACING state as it's the last state that's persisted to the Upgrade Marker before the old Agent's upgrade code restarts the new Agent.
  • if, for some reason, the Upgrade Marker does not contain Upgrade Details and the version of the running Agent is the same as the previous version recorded in the Upgrade Marker, UPG_ROLLBACK will be reported to Fleet.

If the Upgrade Watcher wasn't running yet when the upgrade process was interrupted, the Upgrade Watcher will start monitoring the Agent regardless of what version of Agent is currently running or what version is recorded in the Upgrade Marker file. In this case, UPG_WATCHING will be reported to Fleet. If the watch succeeds, the Upgrade Details will stop being reported to Fleet; note that the version of Agent being reported to Fleet will already be the new one in this case. If the watch fails and Agent has to be rolled back, UPG_ROLLBACK will be reported to Fleet.
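Condensing the two scenarios above into a small Go sketch (hypothetical types and names; this only restates the prose, it is not the project's code):

package sketch

// marker is a stand-in for the upgrade marker contents relevant here.
type marker struct {
	DetailsState    string // e.g. "UPG_REPLACING"; empty if no details were persisted
	PreviousVersion string
}

// reportStateAfterRestart sketches which upgrade state would be reported to
// Fleet when the agent restarts and finds an upgrade marker on disk.
func reportStateAfterRestart(m marker, runningVersion string, watcherAlreadyRan bool) string {
	if watcherAlreadyRan {
		// The restarted watcher exits immediately because watcher.lock still exists.
		if m.DetailsState != "" {
			return m.DetailsState // typically UPG_REPLACING
		}
		if runningVersion == m.PreviousVersion {
			return "UPG_ROLLBACK"
		}
		return "" // not covered by the description above
	}
	// The watcher had not started yet: it begins watching regardless of versions.
	return "UPG_WATCHING"
}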

I think the ideal thing to happen in this situation is that the upgrade restarts from the beginning when the agent host system comes back online.

Agreed but I think restarting the upgrade from the beginning after a crash is beyond the scope of this PR so I've created #3860 to track this improvement.

Member

Agree fixing this is outside the scope of this PR given your explanation.

I also think we acknowledge the upgrade too early since it isn't synchronized with the watcher, but that is also out of scope.

Member

@AndersonQ AndersonQ left a comment

I tested it using the air-gapped test; it works. There is still what I believe is something similar to #3821, but that's because it's upgrading from one snapshot to another. Thus, all good for this PR.

	// error context added by checkUpgradeDetailsState
	return err
}

Member

Maybe we want to check for a final state of the upgrade, like COMPLETED or FAILED?
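For example, a check along these lines could gate the final assertion (the exact set of terminal state names below is an assumption based on the states mentioned in this PR):

package sketch

// isTerminalUpgradeState sketches the suggested check: before asserting that
// the upgrade details disappear, confirm the upgrade reached a terminal state.
func isTerminalUpgradeState(state string) bool {
	switch state {
	case "UPG_COMPLETED", "UPG_FAILED", "UPG_ROLLBACK":
		return true
	default:
		return false
	}
}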

Member

@pchila pchila left a comment

Just a small comment about checking the final state of an upgrade before asserting that the upgrade details disappear

Contributor

mergify bot commented Nov 30, 2023

This pull request now has conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b upgrade-details-fix-upg-watching upstream/upgrade-details-fix-upg-watching
git merge upstream/main
git push upstream upgrade-details-fix-upg-watching

@ycombinator ycombinator force-pushed the upgrade-details-fix-upg-watching branch 2 times, most recently from bf1767d to 9925381 Compare December 1, 2023 13:57
Contributor

mergify bot commented Dec 1, 2023

This pull request now has conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b upgrade-details-fix-upg-watching upstream/upgrade-details-fix-upg-watching
git merge upstream/main
git push upstream upgrade-details-fix-upg-watching

Member

@cmacknz cmacknz left a comment

Tested manually and confirmed it works, thanks!

@ycombinator ycombinator force-pushed the upgrade-details-fix-upg-watching branch from 49883a3 to 7344cd7 Compare December 5, 2023 18:58
@cmacknz cmacknz added backport-v8.12.0 Automated backport with mergify and removed backport-skip labels Dec 6, 2023
@ycombinator ycombinator force-pushed the upgrade-details-fix-upg-watching branch from 5cb41f4 to b040d29 Compare December 6, 2023 16:22
@ycombinator ycombinator force-pushed the upgrade-details-fix-upg-watching branch from 05cb012 to 67cf265 Compare December 14, 2023 17:52
@ycombinator ycombinator enabled auto-merge (squash) December 15, 2023 02:02
@ycombinator ycombinator disabled auto-merge December 15, 2023 02:06
@ycombinator
Contributor Author

ycombinator commented Dec 15, 2023

All upgrade-related integration tests are passing now, but all Endpoint-related integration tests are failing with the same symptom:

Endpoint component or units are not healthy

[EDIT] Downloaded a failing test's diagnostic and looked at the Endpoint logs within it. The earliest errors in the logs say this:

{"@timestamp":"2023-12-15T13:45:00.559583771Z","agent":{"id":"13c4c478-4006-46dc-8cef-28489a279fc2","type":"endpoint"},"ecs":{"version":"1.11.0"},"log":{"level":"error","origin":{"file":{"line":3049,"name":"Artifacts.cpp"}}},"message":"Artifacts.cpp:3049 HTTP code 401: Unauthorized","process":{"pid":26341,"thread":{"id":26377}}}
{"@timestamp":"2023-12-15T13:45:00.559604491Z","agent":{"id":"13c4c478-4006-46dc-8cef-28489a279fc2","type":"endpoint"},"ecs":{"version":"1.11.0"},"log":{"level":"error","origin":{"file":{"line":3057,"name":"Artifacts.cpp"}}},"message":"Artifacts.cpp:3057 Message: {\"statusCode\":401,\"error\":\"ErrNoAuthHeader\",\"message\":\"no authorization header\"}","process":{"pid":26341,"thread":{"id":26377}}}
{"@timestamp":"2023-12-15T13:45:00.559618691Z","agent":{"id":"13c4c478-4006-46dc-8cef-28489a279fc2","type":"endpoint"},"ecs":{"version":"1.11.0"},"log":{"level":"error","origin":{"file":{"line":3088,"name":"Artifacts.cpp"}}},"message":"Artifacts.cpp:3088 Failed to download artifact endpoint-hostisolationexceptionlist-linux-v1 - HTTP non-200 code received","process":{"pid":26341,"thread":{"id":26377}}}
{"@timestamp":"2023-12-15T13:45:00.560412691Z","agent":{"id":"13c4c478-4006-46dc-8cef-28489a279fc2","type":"endpoint"},"ecs":{"version":"1.11.0"},"log":{"level":"error","origin":{"file":{"line":728,"name":"Artifacts.cpp"}}},"message":"Artifacts.cpp:728 Failed to initialize artifact, identifier: endpoint-hostisolationexceptionlist-linux-v1, reason: HTTP non-200 code received","process":{"pid":26341,"thread":{"id":26377}}}
{"@timestamp":"2023-12-15T13:45:00.560423771Z","agent":{"id":"13c4c478-4006-46dc-8cef-28489a279fc2","type":"endpoint"},"ecs":{"version":"1.11.0"},"log":{"level":"error","origin":{"file":{"line":1535,"name":"Artifacts.cpp"}}},"message":"Artifacts.cpp:1535 All artifacts are being rejected because endpoint-hostisolationexceptionlist-linux-v1 is invalid","process":{"pid":26341,"thread":{"id":26377}}}
{"@timestamp":"2023-12-15T13:45:00.560434131Z","agent":{"id":"13c4c478-4006-46dc-8cef-28489a279fc2","type":"endpoint"},"ecs":{"version":"1.11.0"},"log":{"level":"error","origin":{"file":{"line":1564,"name":"Artifacts.cpp"}}},"message":"Artifacts.cpp:1564 Failed to process artifact manifest","process":{"pid":26341,"thread":{"id":26377}}}

Right before those errors, there are these two logs about proxy URLs; not sure if those are relevant to the errors or not:

{"@timestamp":"2023-12-15T13:45:00.384293332Z","agent":{"id":"13c4c478-4006-46dc-8cef-28489a279fc2","type":"endpoint"},"ecs":{"version":"1.11.0"},"log":{"level":"info","origin":{"file":{"line":179,"name":"Proxy.cpp"}}},"message":"Proxy.cpp:179  Global manifest override proxy URL: not set","process":{"pid":26341,"thread":{"id":26377}}}
{"@timestamp":"2023-12-15T13:45:00.384301412Z","agent":{"id":"13c4c478-4006-46dc-8cef-28489a279fc2","type":"endpoint"},"ecs":{"version":"1.11.0"},"log":{"level":"info","origin":{"file":{"line":179,"name":"Proxy.cpp"}}},"message":"Proxy.cpp:179  User manifest override proxy URL: not set","process":{"pid":26341,"thread":{"id":26377}}}

Failures don't seem related to the changes in this PR.

@jlind23
Contributor

jlind23 commented Dec 19, 2023

buildkite test this

Quality Gate passed

The SonarQube Quality Gate passed, but some issues were introduced.

1 New issue
0 Security Hotspots
55.6% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube

@ycombinator ycombinator merged commit ad7e1b5 into elastic:main Dec 19, 2023
@ycombinator ycombinator deleted the upgrade-details-fix-upg-watching branch December 19, 2023 14:22
mergify bot pushed a commit that referenced this pull request Dec 19, 2023
… time that the upgrade is being watched (#3827)

* Don't set upgrade details when creating upgrade marker

* Set UPG_WATCHING state right before starting to watch upgrade

* Log upgrade details whenever they're set on the coordinator

* Fix logging location

* Revert "Don't set upgrade details when creating upgrade marker"

This reverts commit 6821832.

* Fix logic with assuming UPG_ROLLBACK state

* Add FIXME

* Correctly observe upgrade details changes

* Update unit test

* Include upgrade details in status output

* Check upgrade details state before and after upgrade watcher starts

* Check that upgrade details have been cleared out upon successful upgrade

* Update unit test

* Fixing up upgrade integration tests

* Add unit test + fix details object being used

* Define AgentStatusOutput.IsZero() and use it

* Make sure Marker Watcher accounts for `UPG_COMPLETED` state

* Fix location of assertion

* Fix error message

* Join errors for wrapping

* Debugging why TestStandaloneDowngradeToSpecificSnapshotBuild is failing

* Cast string to details.State

* Remove version override debugging

* Wrap bugfix assertions in version checks

* Introduce upgradetest.WithDisableUpgradeWatcherUpgradeDetailsCheck option

* Call option function

* Debugging

* Fixing version check logic

* Remove debugging statements

(cherry picked from commit ad7e1b5)
ycombinator added a commit that referenced this pull request Dec 19, 2023
… time that the upgrade is being watched (#3827) (#3927)

(cherry picked from commit ad7e1b5)

Co-authored-by: Shaunak Kashyap <ycombinator@gmail.com>
cmacknz pushed a commit that referenced this pull request Jan 17, 2024
… time that the upgrade is being watched (#3827) (#3927)

(cherry picked from commit ad7e1b5)

Co-authored-by: Shaunak Kashyap <ycombinator@gmail.com>
Labels
backport-v8.12.0 (Automated backport with mergify), skip-changelog, Team:Elastic-Agent (Label for the Agent team)