[Flaky test] TestFlowAggregator/IPv4/InterNodeFlows #3650
Labels
area/flow-visibility/aggregator
Issues or PRs related to Flow Aggregator
kind/bug
Categorizes issue or PR as related to a bug.
kind/failing-test
Categorizes issue or PR as related to a consistently or frequently failing test.
Comments
@heanlan @dreamtalen could someone take a look at this?
@antoninbas ack
heanlan
added a commit
to heanlan/antrea
that referenced
this issue
Apr 19, 2022
The conntrack connection store's polling goroutine and the flow exporter both access the conntrack connection store, and there is a race condition between them. In the polling goroutine, `deleteIfStaleOrResetConn` and `AddOrUpdateConn` each grab the lock, modify the `conn.IsPresent` field, and release the lock. Between the execution of these two functions, it is likely that FlowExporter's timer is triggered and reads the wrong `conn.IsPresent` value in an intermediate state. We fix it by holding the lock until both functions have finished executing. Fixes: antrea-io#3650 Signed-off-by: heanlan <hanlan@vmware.com>
heanlan
added a commit
to heanlan/antrea
that referenced
this issue
Apr 21, 2022
The conntrack connection store's polling goroutine and the flow exporter both access the conntrack connection store, and there is a race condition between them. In the polling goroutine, `deleteIfStaleOrResetConn` and `AddOrUpdateConn` each grab the lock, modify the `conn.IsPresent` field, and release the lock. Between the execution of these two functions, it is likely that FlowExporter's timer is triggered and reads the wrong `conn.IsPresent` value in an intermediate state. We fix it by holding the lock until both functions have finished executing. Fixes: antrea-io#3650 Signed-off-by: heanlan <hanlan@vmware.com>
antoninbas
pushed a commit
that referenced
this issue
Apr 25, 2022
The conntrack connection store's polling goroutine and the flow exporter both access the conntrack connection store, and there is a race condition between them. In the polling goroutine, `deleteIfStaleOrResetConn` and `AddOrUpdateConn` each grab the lock, modify the `conn.IsPresent` field, and release the lock. Between the execution of these two functions, it is likely that FlowExporter's timer is triggered and reads the wrong `conn.IsPresent` value in an intermediate state. We fix it by holding the lock until both functions have finished executing. Fixes: #3650 Signed-off-by: heanlan <hanlan@vmware.com>
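The fix described in this commit message can be illustrated with a minimal sketch. It is not the Antrea code: the `store` and `conn` types and the `resetLocked`, `addOrUpdateLocked`, `poll`, and `isPresent` names are hypothetical, and it assumes that the first step clears `IsPresent` while the second sets it again for connections found in the latest dump. The point it shows is that both steps run inside one critical section, so a reader that takes the same lock cannot observe the intermediate state.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for a tracked connection.
type conn struct {
	IsPresent bool
}

// store is a hypothetical, simplified connection store guarded by one mutex.
type store struct {
	mu    sync.Mutex
	conns map[string]*conn
}

func newStore() *store {
	return &store{conns: make(map[string]*conn)}
}

// resetLocked clears IsPresent on every connection (assumed behavior of the
// first polling step). The caller must hold s.mu.
func (s *store) resetLocked() {
	for _, c := range s.conns {
		c.IsPresent = false
	}
}

// addOrUpdateLocked marks a connection from the latest dump as present
// (assumed behavior of the second polling step). The caller must hold s.mu.
func (s *store) addOrUpdateLocked(key string) {
	c, ok := s.conns[key]
	if !ok {
		c = &conn{}
		s.conns[key] = c
	}
	c.IsPresent = true
}

// poll runs both steps while holding the lock once, instead of locking and
// unlocking around each step separately.
func (s *store) poll(dumped []string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.resetLocked()
	for _, key := range dumped {
		s.addOrUpdateLocked(key)
	}
}

// isPresent is what a reader such as an exporter timer would call.
func (s *store) isPresent(key string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	c, ok := s.conns[key]
	return ok && c.IsPresent
}

func main() {
	s := newStore()
	s.poll([]string{"a"})

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			s.poll([]string{"a"})
		}
	}()

	// Count reads that see "a" as absent while the poller keeps running.
	stale := 0
	for i := 0; i < 1000; i++ {
		if !s.isPresent("a") {
			stale++
		}
	}
	wg.Wait()
	fmt.Println("stale reads:", stale)
}
```

If `poll` instead released the lock between the reset step and the update step, the reader in `main` could acquire it in that window and see `IsPresent == false` for a live connection, which is the intermediate state the commit message describes.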
Describe the bug
I observed a failure of TestFlowAggregator/IPv4/InterNodeFlows in Kind CI, for the "E2e tests on a Kind cluster on Linux with AntreaProxy all Service support" job (I don't know whether the exact job is relevant):
Full test logs: https://gist.github.com/antoninbas/f543aa2e9eb2c34cba41edf9825bf39e