[Flaky Test]: TestStandaloneUpgradeWithGPGFallbackOneRemoteFailing – commits don't match #6272

Open
cmacknz opened this issue Dec 10, 2024 · 4 comments
Labels
flaky-test Unstable or unreliable test cases. Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team

Comments


cmacknz commented Dec 10, 2024

Failing test case

TestStandaloneUpgradeWithGPGFallbackOneRemoteFailing

Error message

commits don't match: got 96f2b9f, want f4a7f9e

Build

https://buildkite.com/elastic/elastic-agent-extended-testing/builds/5203#0193ac62-c1e0-4b6b-9825-3c6321116e2c

OS

Linux, Windows

Stacktrace and notes

upgrader.go:347: waiting for upgrade watcher to start
    upgrader.go:352: upgrade watcher started
    upgrader.go:357: Checking upgrade details state while Upgrade Watcher is running
    fixture.go:663: >> running binary with: [C:\Program Files\Elastic\Agent\elastic-agent.exe version --binary-only --yaml]
    fixture.go:663: >> running binary with: [C:\Program Files\Elastic\Agent\elastic-agent.exe status --output json]
    fixture.go:663: >> running binary with: [C:\Program Files\Elastic\Agent\elastic-agent.exe status --output json]
    upgrader.go:526: waiting for healthy agent and proper version: commits don't match: got 96f2b9fc0dbddfcaac1c55e0e0bc9aed3bf6badb, want f4a7f9e2b553228b44af9676c77d6e5dc7317045
    fixture.go:663: >> running binary with: [C:\Program Files\Elastic\Agent\elastic-agent.exe status --output json]
    upgrader.go:526: waiting for healthy agent and proper version: commits don't match: got 96f2b9fc0dbddfcaac1c55e0e0bc9aed3bf6badb, want f4a7f9e2b553228b44af9676c77d6e5dc7317045
    fixture.go:663: >> running binary with: [C:\Program Files\Elastic\Agent\elastic-agent.exe status --output json]
    upgrader.go:526: waiting for healthy agent and proper version: commits don't match: got 96f2b9fc0dbddfcaac1c55e0e0bc9aed3bf6badb, want f4a7f9e2b553228b44af9676c77d6e5dc7317045
    fixture.go:663: >> running binary with: [C:\Program Files\Elastic\Agent\elastic-agent.exe status --output json]
    upgrader.go:526: waiting for healthy agent and proper version: commits don't match: got 96f2b9fc0dbddfcaac1c55e0e0bc9aed3bf6badb, want f4a7f9e2b553228b44af9676c77d6e5dc7317045
    fixture.go:663: >> running binary with: [C:\Program Files\Elastic\Agent\elastic-agent.exe status --output json]
    upgrader.go:526: waiting for healthy agent and proper version: commits don't match: got 96f2b9fc0dbddfcaac1c55e0e0bc9aed3bf6badb, want f4a7f9e2b553228b44af9676c77d6e5dc7317045
    fixture.go:663: >> running binary with: [C:\Program Files\Elastic\Agent\elastic-agent.exe status --output json]
    upgrader.go:526: waiting for healthy agent and proper version: commits don't match: got 96f2b9fc0dbddfcaac1c55e0e0bc9aed3bf6badb, want f4a7f9e2b553228b44af9676c77d6e5dc7317045
    fixture.go:663: >> running binary with: [C:\Program Files\Elastic\Agent\elastic-agent.exe status --output json]
    upgrader.go:526: waiting for healthy agent and proper version: commits don't match: got 96f2b9fc0dbddfcaac1c55e0e0bc9aed3bf6badb, want f4a7f9e2b553228b44af9676c77d6e5dc7317045
    fixture.go:663: >> running binary with: [C:\Program Files\Elastic\Agent\elastic-agent.exe status --output json]
    upgrader.go:526: waiting for healthy agent and proper version: commits don't match: got 96f2b9fc0dbddfcaac1c55e0e0bc9aed3bf6badb, want f4a7f9e2b553228b44af9676c77d6e5dc7317045
    fixture.go:663: >> running binary with: [C:\Program Files\Elastic\Agent\elastic-agent.exe status --output json]
    upgrader.go:526: waiting for healthy agent and proper version: commits don't match: got 96f2b9fc0dbddfcaac1c55e0e0bc9aed3bf6badb, want f4a7f9e2b553228b44af9676c77d6e5dc7317045
    fixture.go:663: >> running binary with: [C:\Program Files\Elastic\Agent\elastic-agent.exe status --output json]
    upgrader.go:526: waiting for healthy agent and proper version: commits don't match: got 96f2b9fc0dbddfcaac1c55e0e0bc9aed3bf6badb, want f4a7f9e2b553228b44af9676c77d6e5dc7317045
    fixture.go:663: >> running binary with: [C:\Program Files\Elastic\Agent\elastic-agent.exe status --output json]
    upgrader.go:526: waiting for healthy agent and proper version: commits don't match: got 96f2b9fc0dbddfcaac1c55e0e0bc9aed3bf6badb, want f4a7f9e2b553228b44af9676c77d6e5dc7317045
    fixture.go:663: >> running binary with: [C:\Program Files\Elastic\Agent\elastic-agent.exe status --output json]
    upgrader.go:526: waiting for healthy agent and proper version: could not unmarshal agent status output: unexpected end of JSON input
        context deadline exceeded
    upgrade_gpg_test.go:167: 
        	Error Trace:	C:/Users/windows/agent/testing/integration/upgrade_gpg_test.go:167
        	Error:      	Received unexpected error:
        	            	failed waiting for healthy agent and version (context deadline exceeded): could not unmarshal agent status output: unexpected end of JSON input
        	            	context deadline exceeded
        	Test:       	TestStandaloneUpgradeWithGPGFallbackOneRemoteFailing
        	Messages:   	perform upgrade failed
    fixture_install.go:275: [test TestStandaloneUpgradeWithGPGFallbackOneRemoteFailing] Inside fixture cleanup function
    fixture_install.go:291: collecting diagnostics; test failed
    fixture.go:663: >> running binary with: [C:\Program Files\Elastic\Agent\elastic-agent.exe diagnostics -f C:\Users\windows\agent\build\diagnostics\TestStandaloneUpgradeWithGPGFallbackOneRemoteFailing-2024-12-09T19-21-10Z-diagnostics.zip]
    fixture.go:663: >> running binary with: [C:\Program Files\Elastic\Agent\elastic-agent.exe uninstall --force]
    fixture.go:1031: Dumping running processes in C:\Users\windows\agent\build\diagnostics\TestStandaloneUpgradeWithGPGFallbackOneRemoteFailing-2024-12-09T19-21-10Z-ProcessDump-cleanup.json
    fixture.go:1282: Temporary directory "C:\\Users\\windows\\AppData\\Local\\Temp\\TestStandaloneUpgradeWithGPGFallbackOneRemoteFailing3726556639" preserved for investigation/debugging
    fixture.go:1282: Temporary directory "C:\\Users\\windows\\AppData\\Local\\Temp\\TestStandaloneUpgradeWithGPGFallbackOneRemoteFailing1579482559" preserved for investigation/debugging
--- FAIL: TestStandaloneUpgradeWithGPGFallbackOneRemoteFailing (205.31s)


cmacknz commented Dec 10, 2024

The test is not atomic in how it determines the end version of the upgrade. It makes one call to the artifacts API to find the latest snapshot version, then tells the agent to upgrade, and a new snapshot can be published between those two steps. The test would be reliable if it upgraded to a specific snapshot build ID instead of a bare -SNAPSHOT version, which is unconstrained.

https://github.com/elastic/elastic-agent/commits/8.17/

96f2b9f is the latest 8.17 commit, which is what it upgraded to.

f4a7f9e is the commit before it, which is what the test wants.
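
A minimal sketch of the race, assuming a hypothetical in-memory `SnapshotStore` in place of the real artifacts API. None of these type or method names come from the repository, and the build IDs are made up; only the two short commit hashes are taken from this failure.

```go
package main

import "fmt"

// Build is a hypothetical record for one published snapshot.
type Build struct {
	BuildID string // made-up identifier that names exactly one snapshot
	Commit  string // commit the snapshot was built from
}

// SnapshotStore is a hypothetical stand-in for the artifacts API.
type SnapshotStore struct {
	builds []Build // ordered oldest to newest
}

// Publish appends a new snapshot, which becomes the latest one.
func (s *SnapshotStore) Publish(b Build) { s.builds = append(s.builds, b) }

// Latest is the unconstrained lookup: it returns whatever is newest right now.
func (s *SnapshotStore) Latest() (Build, bool) {
	if len(s.builds) == 0 {
		return Build{}, false
	}
	return s.builds[len(s.builds)-1], true
}

// ByID is the pinned lookup: it returns the same build no matter what was
// published afterwards.
func (s *SnapshotStore) ByID(id string) (Build, bool) {
	for _, b := range s.builds {
		if b.BuildID == id {
			return b, true
		}
	}
	return Build{}, false
}

func main() {
	store := &SnapshotStore{}
	store.Publish(Build{BuildID: "build-1", Commit: "f4a7f9e"})

	// Step 1: the test asks for the latest snapshot and records its commit.
	want, _ := store.Latest()

	// A new snapshot is published between the two steps.
	store.Publish(Build{BuildID: "build-2", Commit: "96f2b9f"})

	// Step 2a: an unconstrained upgrade resolves "latest" a second time.
	got, _ := store.Latest()
	fmt.Println("unconstrained match:", got.Commit == want.Commit) // false

	// Step 2b: an upgrade pinned to the recorded build ID cannot drift.
	pinned, _ := store.ByID(want.BuildID)
	fmt.Println("pinned match:", pinned.Commit == want.Commit) // true
}
```

The point is only that "latest" gets resolved twice in the failing flow. Resolving it once and passing the resulting identifier to both the expectation and the upgrade removes the window.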

The test even has a comment explaining that it has to download the real snapshot artifact (probably because that artifact is signed with our GPG key):

err = upgradetest.PerformUpgrade(
	ctx, startFixture, endFixture, t,
	// passing "" as source URI is a hack to disable the --source-uri argument pointing at the endFixture srcPackage location
	// this test needs the agent to download the real thing from artifacts.elastic.co so empty string.
	// We need to download the same file from the same url and use that as end fixture
	// or we need a way to disable the commit hash check (in this case the upgrade can be verified just with the
	// version string)
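
The second option in that comment, disabling the commit hash check, could look roughly like the sketch below. `VersionInfo` and `checkUpgraded` are hypothetical names, not the `upgradetest` API, and the version string is a placeholder because the issue does not state the exact version. Only the "commits don't match" message format is taken from the log above.

```go
package main

import "fmt"

// VersionInfo is a hypothetical summary of what a binary reports about itself.
type VersionInfo struct {
	Version string
	Commit  string
}

// checkUpgraded compares the observed agent against the expected one.
// With checkCommit set to false, only the version string is compared.
func checkUpgraded(got, want VersionInfo, checkCommit bool) error {
	if got.Version != want.Version {
		return fmt.Errorf("versions don't match: got %s, want %s", got.Version, want.Version)
	}
	if checkCommit && got.Commit != want.Commit {
		return fmt.Errorf("commits don't match: got %s, want %s", got.Commit, want.Commit)
	}
	return nil
}

func main() {
	// Placeholder version string; the commits are the two from this failure.
	want := VersionInfo{Version: "8.17.0-SNAPSHOT", Commit: "f4a7f9e"}
	got := VersionInfo{Version: "8.17.0-SNAPSHOT", Commit: "96f2b9f"}

	fmt.Println(checkUpgraded(got, want, true))  // commits don't match: got 96f2b9f, want f4a7f9e
	fmt.Println(checkUpgraded(got, want, false)) // <nil>
}
```

The trade-off is that a version-only check accepts any snapshot of that version, so it hides the race instead of removing it. Pinning the build ID keeps the stricter check meaningful.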


cmacknz commented Dec 10, 2024

It looks like we tried to fix this once before in #4330.
