You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We recently introduced a bug that broke the upgrade process for agents deployed in air gapped environments: #3368. One root cause for this is that we have no automated testing for air gapped deployments, largely because doing this for the entire stack in an automated manner would be challenging considering the automation itself needs network access to run the test.
While we likely can't create a true air gapped environment, we should be able to emulate one. The Elastic Agent only makes three outgoing network connections to known hosts, only one of which is likely to cause unique problems in an air gapped environment:
A connection to Fleet Server. This connection must exist in all deployments, and whether it is on an external network or internal network makes no difference to the Elastic Agent. For air gapped environments the majority of the difference is in the network setup and infrastructure.
A connection to the selected output, usually Elasticsearch. This connection must exist in all deployments, and is in the same category as Fleet Server above.
For the purposes of an air gapped upgrade, we only need to care about connections 3 and 4 which default to the public artifacts.elastic.co URL. We should be able to write a test that maintains network connectivity but blocks requests to this host from succeeding unless the connection in 3 is changed to a reachable alternative (ideally a local file server) and the failure from the connection in 4 is ignored as described in #3368.
The scope of this issue is to implement this test and prove that it adequately simulates an air gapped upgrade by verifying that the test fails until the binary download location is changed and specifically verifies that the fallback GPG URL request error is ignored.
The text was updated successfully, but these errors were encountered:
cmacknz
changed the title
Add integration test to emulate an air gapped agent upgrade
Add an integration test to emulate an air gapped agent upgrade
Sep 12, 2023
We should actually block *.elastic.co entirely and not just artifacts.elastic.co since the exact hostname varies depending on what we are doing. artifacts is the prefix for production releases, in tests it is likely to be snapshots.elastic.co for example.
We recently introduced a bug that broke the upgrade process for agents deployed in air gapped environments: #3368. One root cause for this is that we have no automated testing for air gapped deployments, largely because doing this for the entire stack in an automated manner would be challenging considering the automation itself needs network access to run the test.
While we likely can't create a true air gapped environment, we should be able to emulate one. The Elastic Agent only makes three outgoing network connections to known hosts, only one of which is likely to cause unique problems in an air gapped environment:
elastic-agent/internal/pkg/agent/application/upgrade/step_download.go
Line 33 in 0a19f36
For the purposes of an air gapped upgrade, we only need to care about connections 3 and 4 which default to the public artifacts.elastic.co URL. We should be able to write a test that maintains network connectivity but blocks requests to this host from succeeding unless the connection in 3 is changed to a reachable alternative (ideally a local file server) and the failure from the connection in 4 is ignored as described in #3368.
The scope of this issue is to implement this test and prove that it adequately simulates an air gapped upgrade by verifying that the test fails until the binary download location is changed and specifically verifies that the fallback GPG URL request error is ignored.
The text was updated successfully, but these errors were encountered: