e2e test "TestBatchCreatePods" is Flaky #4086

Closed
wenyingd opened this issue Aug 8, 2022 · 6 comments · Fixed by #4104
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@wenyingd
Contributor

wenyingd commented Aug 8, 2022

Describe the bug

Hi,

I hit this failure when running the e2e tests on an IPv6-only or dual-stack testbed. The error shows that the agent's file descriptors (FDs) changed after batch-creating Pods. I hit the failure several times, and it looks flaky because the test may pass when I re-run it. I hit the error with my code change that switches the OpenFlow version from 1.3 to 1.5, but I don't think the OpenFlow version change is the cause, so I am creating this issue to track it.

Below is the output of the error:

=== RUN   TestBatchCreatePods
    fixtures.go:222: Creating 'testbatchcreatepods-fkmovzrl' K8s Namespace
2022/08/08 05:50:08 Applying Antrea YAML
2022/08/08 05:50:10 Waiting for all Antrea DaemonSet Pods
2022/08/08 05:50:11 Checking CoreDNS deployment
    fixtures.go:488: Creating a test Pod 'test-pod-jxlkx8fn' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-h610hvfv' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-fnfl9kv5' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-tjry3jys' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-tq0rtbyi' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-1u9nli40' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-fz2uyy2g' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-8mnc7cny' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-z9f1fk8o' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-nl78xn29' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-do7lpfal' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-20ktb34d' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-rzxla9cn' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-ifusulmd' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-p6xgw3bh' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-glklzfjc' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-bil0ajic' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-9zu8mtr2' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-cfce87gt' and waiting for IP
    fixtures.go:488: Creating a test Pod 'test-pod-h0ea15rt' and waiting for IP
I0808 05:50:12.978261   20916 request.go:601] Waited for 1.187407918s due to client-side throttling, not priority and fairness, request: POST:https://[2620:124:6020:1006:250:56ff:fea7:a632]:6443/api/v1/namespaces/testbatchcreatepods-fkmovzrl/pods
I0808 05:50:23.178464   20916 request.go:601] Waited for 3.368735234s due to client-side throttling, not priority and fairness, request: GET:https://[2620:124:6020:1006:250:56ff:fea7:a632]:6443/api/v1/namespaces/testbatchcreatepods-fkmovzrl/pods/test-pod-fnfl9kv5
    batch_test.go:61: 
        	Error Trace:	batch_test.go:61
        	Error:      	Not equal: 
        	            	expected: "total 0\nlrwx------ 1 root root 64  0 -> /dev/null\nl-wx------ 1 root root 64  1 -> pipe:[642759189]\nlrwx------ 1 root root 64  10 -> socket:[642759983]\nlrwx------ 1 root root 64  11 -> socket:[642760818]\nlrwx------ 1 root root 64  12 -> socket:[642760821]\nlrwx------ 1 root root 64  13 -> socket:[642760831]\nlrwx------ 1 root root 64  14 -> socket:[642760089]\nlrwx------ 1 root root 64  15 -> socket:[642761833]\nlrwx------ 1 root root 64  16 -> socket:[642760364]\nlrwx------ 1 root root 64  18 -> socket:[642765137]\nl-wx------ 1 root root 64  2 -> pipe:[642759190]\nlrwx------ 1 root root 64  20 -> socket:[642760952]\nlrwx------ 1 root root 64  22 -> /var/log/antrea/antrea-agent.antrea-ipv6-2-1.root.log.ERROR.20220808-053259.1\nlrwx------ 1 root root 64  24 -> /var/log/antrea/antrea-agent.antrea-ipv6-2-1.root.log.WARNING.20220808-053259.1\nlrwx------ 1 root root 64  3 -> socket:[642757436]\nlrwx------ 1 root root 64  4 -> anon_inode:[eventpoll]\nlr-x------ 1 root root 64  5 -> pipe:[642757432]\nl-wx------ 1 root root 64  6 -> pipe:[642757432]\nlrwx------ 1 root root 64  7 -> /var/log/antrea/antrea-agent.antrea-ipv6-2-1.root.log.INFO.20220808-053257.1\nlrwx------ 1 root root 64  8 -> socket:[642759806]\nlrwx------ 1 root root 64  9 -> socket:[642759560]\n"
        	            	actual  : "total 0\nlrwx------ 1 root root 64  0 -> /dev/null\nl-wx------ 1 root root 64  1 -> pipe:[642759189]\nlrwx------ 1 root root 64  10 -> socket:[642759983]\nlrwx------ 1 root root 64  11 -> socket:[642760818]\nlrwx------ 1 root root 64  12 -> socket:[642760821]\nlrwx------ 1 root root 64  13 -> socket:[642760831]\nlrwx------ 1 root root 64  14 -> socket:[642760089]\nlrwx------ 1 root root 64  15 -> socket:[642761833]\nlrwx------ 1 root root 64  16 -> socket:[642760364]\nlrwx------ 1 root root 64  17 -> socket:[643041057]\nlrwx------ 1 root root 64  18 -> socket:[642765137]\nlrwx------ 1 root root 64  19 -> socket:[643042759]\nl-wx------ 1 root root 64  2 -> pipe:[642759190]\nlrwx------ 1 root root 64  20 -> socket:[642760952]\nlrwx------ 1 root root 64  21 -> socket:[643052702]\nlrwx------ 1 root root 64  22 -> /var/log/antrea/antrea-agent.antrea-ipv6-2-1.root.log.ERROR.20220808-053259.1\nlrwx------ 1 root root 64  23 -> socket:[643051982]\nlrwx------ 1 root root 64  24 -> /var/log/antrea/antrea-agent.antrea-ipv6-2-1.root.log.WARNING.20220808-053259.1\nlrwx------ 1 root root 64  25 -> socket:[643052016]\nlrwx------ 1 root root 64  27 -> socket:[643051914]\nlrwx------ 1 root root 64  28 -> socket:[643052817]\nlrwx------ 1 root root 64  29 -> socket:[643052761]\nlrwx------ 1 root root 64  3 -> socket:[642757436]\nlrwx------ 1 root root 64  30 -> socket:[643053316]\nlrwx------ 1 root root 64  33 -> socket:[643047506]\nlrwx------ 1 root root 64  34 -> socket:[643046324]\nlrwx------ 1 root root 64  35 -> socket:[643047907]\nlrwx------ 1 root root 64  36 -> socket:[643048361]\nlrwx------ 1 root root 64  37 -> socket:[643047619]\nlrwx------ 1 root root 64  38 -> socket:[643047556]\nlrwx------ 1 root root 64  39 -> socket:[643047559]\nlrwx------ 1 root root 64  4 -> anon_inode:[eventpoll]\nlrwx------ 1 root root 64  40 -> socket:[643047681]\nlrwx------ 1 root root 64  41 -> socket:[643051214]\nlrwx------ 1 root root 64  42 -> 
socket:[643046154]\nlrwx------ 1 root root 64  43 -> socket:[643048385]\nlrwx------ 1 root root 64  44 -> socket:[643049082]\nlrwx------ 1 root root 64  45 -> socket:[643046312]\nlrwx------ 1 root root 64  46 -> socket:[643049100]\nlrwx------ 1 root root 64  47 -> socket:[643049107]\nlrwx------ 1 root root 64  48 -> socket:[643050042]\nlrwx------ 1 root root 64  49 -> socket:[643050097]\nlr-x------ 1 root root 64  5 -> pipe:[642757432]\nlrwx------ 1 root root 64  50 -> socket:[643050118]\nlrwx------ 1 root root 64  51 -> socket:[643048228]\nlrwx------ 1 root root 64  52 -> socket:[643052671]\nlrwx------ 1 root root 64  53 -> socket:[643046315]\nlrwx------ 1 root root 64  54 -> socket:[643047837]\nlrwx------ 1 root root 64  55 -> socket:[643046278]\nlrwx------ 1 root root 64  56 -> socket:[643047888]\nlrwx------ 1 root root 64  57 -> socket:[643048254]\nlrwx------ 1 root root 64  58 -> socket:[643051841]\nlrwx------ 1 root root 64  59 -> socket:[643052820]\nl-wx------ 1 root root 64  6 -> pipe:[642757432]\nlrwx------ 1 root root 64  60 -> socket:[643050854]\nlrwx------ 1 root root 64  61 -> socket:[643052877]\nlrwx------ 1 root root 64  62 -> socket:[643051747]\nlrwx------ 1 root root 64  63 -> socket:[643052725]\nlrwx------ 1 root root 64  64 -> socket:[643048281]\nlrwx------ 1 root root 64  65 -> socket:[643051163]\nlrwx------ 1 root root 64  66 -> socket:[643048347]\nlrwx------ 1 root root 64  67 -> socket:[643052735]\nlrwx------ 1 root root 64  68 -> socket:[643051781]\nlrwx------ 1 root root 64  69 -> socket:[643053274]\nlrwx------ 1 root root 64  7 -> /var/log/antrea/antrea-agent.antrea-ipv6-2-1.root.log.INFO.20220808-053257.1\nlrwx------ 1 root root 64  70 -> socket:[643053333]\nlrwx------ 1 root root 64  72 -> socket:[643051806]\nlrwx------ 1 root root 64  73 -> socket:[643051819]\nlrwx------ 1 root root 64  74 -> socket:[643051945]\nlrwx------ 1 root root 64  75 -> socket:[643052898]\nlrwx------ 1 root root 64  8 -> socket:[642759806]\nlrwx------ 1 root 
root 64  87 -> socket:[643043096]\nlrwx------ 1 root root 64  9 -> socket:[642759560]\nlrwx------ 1 root root 64  91 -> socket:[643043204]\nlrwx------ 1 root root 64  92 -> socket:[643043257]\nlrwx------ 1 root root 64  94 -> socket:[643047479]\nlrwx------ 1 root root 64  95 -> socket:[643046322]\n"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -10,14 +10,70 @@
        	            	 lrwx------ 1 root root 64  16 -> socket:[642760364]
        	            	+lrwx------ 1 root root 64  17 -> socket:[643041057]
        	            	 lrwx------ 1 root root 64  18 -> socket:[642765137]
        	            	+lrwx------ 1 root root 64  19 -> socket:[643042759]
        	            	 l-wx------ 1 root root 64  2 -> pipe:[642759190]
        	            	 lrwx------ 1 root root 64  20 -> socket:[642760952]
        	            	+lrwx------ 1 root root 64  21 -> socket:[643052702]
        	            	 lrwx------ 1 root root 64  22 -> /var/log/antrea/antrea-agent.antrea-ipv6-2-1.root.log.ERROR.20220808-053259.1
        	            	+lrwx------ 1 root root 64  23 -> socket:[643051982]
        	            	 lrwx------ 1 root root 64  24 -> /var/log/antrea/antrea-agent.antrea-ipv6-2-1.root.log.WARNING.20220808-053259.1
        	            	+lrwx------ 1 root root 64  25 -> socket:[643052016]
        	            	+lrwx------ 1 root root 64  27 -> socket:[643051914]
        	            	+lrwx------ 1 root root 64  28 -> socket:[643052817]
        	            	+lrwx------ 1 root root 64  29 -> socket:[643052761]
        	            	 lrwx------ 1 root root 64  3 -> socket:[642757436]
        	            	+lrwx------ 1 root root 64  30 -> socket:[643053316]
        	            	+lrwx------ 1 root root 64  33 -> socket:[643047506]
        	            	+lrwx------ 1 root root 64  34 -> socket:[643046324]
        	            	+lrwx------ 1 root root 64  35 -> socket:[643047907]
        	            	+lrwx------ 1 root root 64  36 -> socket:[643048361]
        	            	+lrwx------ 1 root root 64  37 -> socket:[643047619]
        	            	+lrwx------ 1 root root 64  38 -> socket:[643047556]
        	            	+lrwx------ 1 root root 64  39 -> socket:[643047559]
        	            	 lrwx------ 1 root root 64  4 -> anon_inode:[eventpoll]
        	            	+lrwx------ 1 root root 64  40 -> socket:[643047681]
        	            	+lrwx------ 1 root root 64  41 -> socket:[643051214]
        	            	+lrwx------ 1 root root 64  42 -> socket:[643046154]
        	            	+lrwx------ 1 root root 64  43 -> socket:[643048385]
        	            	+lrwx------ 1 root root 64  44 -> socket:[643049082]
        	            	+lrwx------ 1 root root 64  45 -> socket:[643046312]
        	            	+lrwx------ 1 root root 64  46 -> socket:[643049100]
        	            	+lrwx------ 1 root root 64  47 -> socket:[643049107]
        	            	+lrwx------ 1 root root 64  48 -> socket:[643050042]
        	            	+lrwx------ 1 root root 64  49 -> socket:[643050097]
        	            	 lr-x------ 1 root root 64  5 -> pipe:[642757432]
        	            	+lrwx------ 1 root root 64  50 -> socket:[643050118]
        	            	+lrwx------ 1 root root 64  51 -> socket:[643048228]
        	            	+lrwx------ 1 root root 64  52 -> socket:[643052671]
        	            	+lrwx------ 1 root root 64  53 -> socket:[643046315]
        	            	+lrwx------ 1 root root 64  54 -> socket:[643047837]
        	            	+lrwx------ 1 root root 64  55 -> socket:[643046278]
        	            	+lrwx------ 1 root root 64  56 -> socket:[643047888]
        	            	+lrwx------ 1 root root 64  57 -> socket:[643048254]
        	            	+lrwx------ 1 root root 64  58 -> socket:[643051841]
        	            	+lrwx------ 1 root root 64  59 -> socket:[643052820]
        	            	 l-wx------ 1 root root 64  6 -> pipe:[642757432]
        	            	+lrwx------ 1 root root 64  60 -> socket:[643050854]
        	            	+lrwx------ 1 root root 64  61 -> socket:[643052877]
        	            	+lrwx------ 1 root root 64  62 -> socket:[643051747]
        	            	+lrwx------ 1 root root 64  63 -> socket:[643052725]
        	            	+lrwx------ 1 root root 64  64 -> socket:[643048281]
        	            	+lrwx------ 1 root root 64  65 -> socket:[643051163]
        	            	+lrwx------ 1 root root 64  66 -> socket:[643048347]
        	            	+lrwx------ 1 root root 64  67 -> socket:[643052735]
        	            	+lrwx------ 1 root root 64  68 -> socket:[643051781]
        	            	+lrwx------ 1 root root 64  69 -> socket:[643053274]
        	            	 lrwx------ 1 root root 64  7 -> /var/log/antrea/antrea-agent.antrea-ipv6-2-1.root.log.INFO.20220808-053257.1
        	            	+lrwx------ 1 root root 64  70 -> socket:[643053333]
        	            	+lrwx------ 1 root root 64  72 -> socket:[643051806]
        	            	+lrwx------ 1 root root 64  73 -> socket:[643051819]
        	            	+lrwx------ 1 root root 64  74 -> socket:[643051945]
        	            	+lrwx------ 1 root root 64  75 -> socket:[643052898]
        	            	 lrwx------ 1 root root 64  8 -> socket:[642759806]
        	            	+lrwx------ 1 root root 64  87 -> socket:[643043096]
        	            	 lrwx------ 1 root root 64  9 -> socket:[642759560]
        	            	+lrwx------ 1 root root 64  91 -> socket:[643043204]
        	            	+lrwx------ 1 root root 64  92 -> socket:[643043257]
        	            	+lrwx------ 1 root root 64  94 -> socket:[643047479]
        	            	+lrwx------ 1 root root 64  95 -> socket:[643046322]
        	            	 
        	Test:       	TestBatchCreatePods
        	Messages:   	FDs were changed after batched Pod creation
    fixtures.go:440: Deleting Pod 'test-pod-p6xgw3bh'
    fixtures.go:440: Deleting Pod 'test-pod-jxlkx8fn'
    fixtures.go:440: Deleting Pod 'test-pod-nl78xn29'
    fixtures.go:440: Deleting Pod 'test-pod-do7lpfal'
    fixtures.go:440: Deleting Pod 'test-pod-20ktb34d'
    fixtures.go:440: Deleting Pod 'test-pod-tjry3jys'
    fixtures.go:440: Deleting Pod 'test-pod-1u9nli40'
    fixtures.go:440: Deleting Pod 'test-pod-cfce87gt'
    fixtures.go:440: Deleting Pod 'test-pod-h0ea15rt'
    fixtures.go:440: Deleting Pod 'test-pod-h610hvfv'
    fixtures.go:440: Deleting Pod 'test-pod-fnfl9kv5'
    fixtures.go:440: Deleting Pod 'test-pod-fz2uyy2g'
    fixtures.go:440: Deleting Pod 'test-pod-8mnc7cny'
    fixtures.go:440: Deleting Pod 'test-pod-glklzfjc'
    fixtures.go:440: Deleting Pod 'test-pod-tq0rtbyi'
    fixtures.go:440: Deleting Pod 'test-pod-z9f1fk8o'
    fixtures.go:440: Deleting Pod 'test-pod-rzxla9cn'
    fixtures.go:440: Deleting Pod 'test-pod-ifusulmd'
    fixtures.go:440: Deleting Pod 'test-pod-bil0ajic'
    fixtures.go:440: Deleting Pod 'test-pod-9zu8mtr2'
    fixtures.go:288: Exporting test logs to '/var/lib/jenkins/workspace/antrea-ipv6-only-e2e-for-pull-request/antrea-test-logs/TestBatchCreatePods/beforeTeardown.Aug08-05-50-30'
    fixtures.go:400: Error when exporting kubelet logs: error when running journalctl on Node 'antrea-ipv6-2-0', is it available? Error: <nil>
    fixtures.go:433: Deleting 'testbatchcreatepods-fkmovzrl' K8s Namespace
I0808 05:50:39.842962   20916 framework.go:643] Deleting Namespace testbatchcreatepods-fkmovzrl took 2.903177ms
--- FAIL: TestBatchCreatePods (31.80s)
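
The assertion above compares two complete FD listings as strings, which makes the failure hard to read. As a rough sketch (not the code in batch_test.go), the listings could be reduced to just the descriptors that appeared after Pod creation. The helper names `parseFDs` and `addedFDs` are hypothetical, and the input is assumed to be `ls -l /proc/<pid>/fd` output in the format shown in the log.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// parseFDs maps each descriptor number to its link target, given a listing
// in the "... 64  17 -> socket:[643041057]" format shown in the log.
func parseFDs(listing string) map[string]string {
	fds := make(map[string]string)
	for _, line := range strings.Split(listing, "\n") {
		left, target, ok := strings.Cut(line, " -> ")
		if !ok {
			continue // skips "total 0" and empty lines
		}
		fields := strings.Fields(left)
		if len(fields) == 0 {
			continue
		}
		// The descriptor number is the last field before the arrow.
		fds[fields[len(fields)-1]] = target
	}
	return fds
}

// addedFDs returns sorted "fd -> target" entries that exist in after
// but not in before.
func addedFDs(before, after string) []string {
	old := parseFDs(before)
	var added []string
	for fd, target := range parseFDs(after) {
		if _, seen := old[fd]; !seen {
			added = append(added, fd+" -> "+target)
		}
	}
	sort.Strings(added)
	return added
}

func main() {
	before := "total 0\n" +
		"lrwx------ 1 root root 64  16 -> socket:[642760364]\n" +
		"lrwx------ 1 root root 64  18 -> socket:[642765137]\n"
	after := "total 0\n" +
		"lrwx------ 1 root root 64  16 -> socket:[642760364]\n" +
		"lrwx------ 1 root root 64  17 -> socket:[643041057]\n" +
		"lrwx------ 1 root root 64  18 -> socket:[642765137]\n" +
		"lrwx------ 1 root root 64  19 -> socket:[643042759]\n"
	for _, entry := range addedFDs(before, after) {
		fmt.Println(entry)
	}
}
```

With the sample input, only descriptors 17 and 19 are reported. In the failure above, every added entry is a socket.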

To Reproduce

Expected

Actual behavior

Versions:

Antrea: main branch (v1.7+)

Additional context

@wenyingd wenyingd added the kind/bug Categorizes issue or PR as related to a bug. label Aug 8, 2022
@wenyingd wenyingd changed the title e2e test "" is Flaky e2e test "TestBatchCreatePods" is Flaky Aug 8, 2022
@XinShuYang
Contributor

@tnqn Hi, the TestBatchCreatePods failure often blocks the IPv6 CI pipeline. Do you have any idea about the root cause?

@wenyingd
Contributor Author

Having checked the agent logs, I do not see any re-connection between the agent and OVS.

@wenyingd
Contributor Author

I suspect the leaked sockets are not Unix domain sockets but TCP connections from the Agent to the API Server or to the Controller. The same issue has recently been seen on both IPv6-only and dual-stack testbeds, but not on IPv4-only ones.
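
One way to check which kind of socket an inode such as 643041057 belongs to is to look it up in the kernel's per-protocol tables under /proc/net. The sketch below is only an illustration: the helper name `hasInode` is hypothetical, the column positions are my understanding of the Linux table layouts, and it has to run in the same network namespace as the agent.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// inodeColumn is the zero-based, whitespace-separated column assumed to hold
// the socket inode in each table (assumption based on the usual Linux layout).
var inodeColumn = map[string]int{
	"/proc/net/unix": 6,
	"/proc/net/tcp":  9,
	"/proc/net/tcp6": 9,
	"/proc/net/udp6": 9,
	"/proc/net/raw6": 9,
}

// hasInode reports whether any row of table has inode in the given column.
func hasInode(table string, column int, inode string) bool {
	for _, line := range strings.Split(table, "\n") {
		fields := strings.Fields(line)
		if len(fields) > column && fields[column] == inode {
			return true
		}
	}
	return false
}

func main() {
	inode := "643041057" // taken from the failure output; pass another as an argument
	if len(os.Args) > 1 {
		inode = os.Args[1]
	}
	for path, column := range inodeColumn {
		data, err := os.ReadFile(path)
		if err != nil {
			continue // table not available on this system
		}
		if hasInode(string(data), column, inode) {
			fmt.Printf("inode %s found in %s\n", inode, path)
		}
	}
}
```

A match in /proc/net/raw6 rather than /proc/net/tcp6 would point at a raw ICMPv6 socket instead of a TCP connection.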

@wenyingd
Contributor Author

Having looked through the code, I suspect the cause is that the socket used to send the gratuitous IPv6 NDP packet for a new Pod is never closed. https://github.com/antrea-io/antrea/blob/main/pkg/agent/util/ndp/ndp.go#L45 . I will try to verify whether closing the socket fixes it.
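
The general shape of such a fix is to close the connection on every path once the packet has been sent. The following is a minimal sketch of that pattern only, not the Antrea code: `packetConn`, `fakeConn`, `sendOnce` and `openAfter` are hypothetical names, and a counting fake stands in for the real socket so the example runs without network privileges.

```go
package main

import (
	"fmt"
	"net"
)

// packetConn is the small subset of net.PacketConn that the sender needs.
type packetConn interface {
	WriteTo(b []byte, addr net.Addr) (int, error)
	Close() error
}

// fakeConn stands in for a real socket and tracks how many are still open.
type fakeConn struct{ open *int }

func (c fakeConn) WriteTo(b []byte, _ net.Addr) (int, error) { return len(b), nil }
func (c fakeConn) Close() error                              { *c.open--; return nil }

// sendOnce opens a connection, sends one payload, and always closes the
// connection again. Without the deferred Close, each call would leave one
// descriptor open.
func sendOnce(open func() (packetConn, error), payload []byte, dst net.Addr) error {
	conn, err := open()
	if err != nil {
		return err
	}
	defer conn.Close()
	_, err = conn.WriteTo(payload, dst)
	return err
}

// openAfter calls sendOnce n times and reports how many connections remain open.
func openAfter(n int) int {
	open := 0
	dial := func() (packetConn, error) {
		open++
		return fakeConn{open: &open}, nil
	}
	for i := 0; i < n; i++ {
		if err := sendOnce(dial, []byte("advertisement"), nil); err != nil {
			return -1
		}
	}
	return open
}

func main() {
	// 20 sends mirror the 20 test Pods created in the log.
	fmt.Println("connections still open after 20 sends:", openAfter(20))
}
```

With the deferred `Close` in place the count returns to zero after every send; removing it leaves one open connection per Pod.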

@tnqn
Member

tnqn commented Aug 11, 2022

Having looked through the code, I suspect the cause is that the socket used to send the gratuitous IPv6 NDP packet for a new Pod is never closed. https://github.com/antrea-io/antrea/blob/main/pkg/agent/util/ndp/ndp.go#L45 . I will try to verify whether closing the socket fixes it.

Nice catch. Glad to see TestBatchCreatePods still has some value. Do you know why it sometimes succeeds? Were the FDs sometimes garbage collected?

@wenyingd
Contributor Author

I didn't investigate why it sometimes succeeds.
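
The garbage-collection hypothesis was not confirmed in this thread, but it can be tried in isolation. Go's standard library attaches finalizers to file handles, so a descriptor that is never closed explicitly may still be released if a GC cycle runs before the FDs are listed again. The sketch below is an assumption-laden experiment, not a diagnosis: it uses plain files instead of sockets, reads /proc/self/fd (Linux only), and its counts depend on GC and finalizer timing.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"time"
)

// countFDs returns the number of entries in /proc/self/fd (Linux only).
// The count includes the descriptor used to read the directory itself.
func countFDs() (int, error) {
	entries, err := os.ReadDir("/proc/self/fd")
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

// leak opens path n times and drops every handle without calling Close.
func leak(path string, n int) error {
	for i := 0; i < n; i++ {
		if _, err := os.Open(path); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	base, err := countFDs()
	if err != nil {
		fmt.Println("skipping, /proc/self/fd is not available:", err)
		return
	}
	if err := leak("/dev/null", 20); err != nil {
		fmt.Println("open failed:", err)
		return
	}
	leaked, _ := countFDs()

	runtime.GC()                       // make the dropped handles collectable now
	time.Sleep(100 * time.Millisecond) // give the finalizer goroutine time to run
	after, _ := countFDs()

	fmt.Printf("baseline=%d after leak=%d after GC=%d\n", base, leaked, after)
}
```

If the count drops after the forced GC, the same mechanism could explain why the e2e check only fails on some runs.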
