This repository has been archived by the owner on Jan 28, 2021. It is now read-only.

Check for dead sockets before timeout and enforce timeouts #806

Merged
merged 5 commits into from
Aug 21, 2019

Conversation

juanjux
Contributor

@juanjux juanjux commented Aug 16, 2019

Fixes #800 at least in Linux (once merged I'll open issues for Darwin and Windows).
Related to src-d/gitbase-spark-connector-enterprise#81

This supersedes #801; the timeout-checking part is basically the same, with minor changes. What this adds is another goroutine that uses the Linux /proc filesystem to check whether the socket is closed and, in that case, cancels the connection and its queries.
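To illustrate the kind of check described above, here is a minimal sketch of reading a socket's state from one row of `/proc/net/tcp`. It is not the code in this PR: `parseTCPLine`, `hexPort`, and the sample row are hypothetical, and the sketch only relies on the documented layout of that file (hex `address:port` pairs followed by a hex state column, where `01` is ESTABLISHED and `08` is CLOSE_WAIT).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// TCP state codes as printed (in hex) in the "st" column of /proc/net/tcp.
const (
	tcpEstablished uint8 = 0x01
	tcpCloseWait   uint8 = 0x08
)

// hexPort returns the port from an "ADDR:PORT" field where both parts are hex.
func hexPort(field string) (uint16, error) {
	i := strings.LastIndexByte(field, ':')
	if i < 0 {
		return 0, fmt.Errorf("no port in %q", field)
	}
	p, err := strconv.ParseUint(field[i+1:], 16, 16)
	if err != nil {
		return 0, err
	}
	return uint16(p), nil
}

// parseTCPLine extracts the local port, remote port and state from a single
// data row of /proc/net/tcp (hypothetical helper, not taken from the PR).
func parseTCPLine(line string) (local, remote uint16, state uint8, err error) {
	f := strings.Fields(line)
	if len(f) < 4 {
		return 0, 0, 0, fmt.Errorf("short line: %q", line)
	}
	if local, err = hexPort(f[1]); err != nil {
		return 0, 0, 0, err
	}
	if remote, err = hexPort(f[2]); err != nil {
		return 0, 0, 0, err
	}
	st, err := strconv.ParseUint(f[3], 16, 8)
	if err != nil {
		return 0, 0, 0, err
	}
	return local, remote, uint8(st), nil
}

func main() {
	// Made-up sample row: local port 0x0CEA, remote port 0xD2F0, state 08.
	line := "   0: 0100007F:0CEA 0100007F:D2F0 08 00000000:00000000 00:00000000 00000000  1000        0 12345"
	local, remote, state, err := parseTCPLine(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("local=%d remote=%d closeWait=%v\n", local, remote, state == tcpCloseWait)
}
```

A checker goroutine could scan the rows for the connection's port pair and treat a CLOSE_WAIT state as a dead client.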

While this is mergeable, I'll ask to wait until I say it can be merged, because I want to try to get rid of the timeout-checking goroutine. The problem is that, even if there is a timeout, the socket state won't change to CLOSE_WAIT until there is a real read() or write() on it, so the timeout-checking goroutine is still needed. I want to try configuring keepalives for the connection and see whether, with them, the timeout actually changes the socket state (so that the connection-checking goroutine can reap it). Also, I would like to get an OK from @lwsanty, who found a problem in the previous PR.
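The interplay between the two mechanisms can be sketched as follows: a deadline enforces the timeout, while a separate polling goroutine cancels the same context as soon as the connection is reported dead. This is only an illustration under assumptions; `runGuarded`, `waitForCancel`, the `alive` callback, and the demo durations are hypothetical names and values, not the PR's API.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runGuarded runs work under a timeout and, in parallel, polls alive();
// when alive() reports false the context is cancelled before the deadline.
func runGuarded(parent context.Context, timeout, poll time.Duration,
	alive func() bool, work func(context.Context) error) error {

	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()

	go func() {
		ticker := time.NewTicker(poll)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done(): // work finished, timed out, or was cancelled
				return
			case <-ticker.C:
				if !alive() {
					cancel() // dead socket: stop the query right away
					return
				}
			}
		}
	}()

	return work(ctx)
}

// waitForCancel stands in for a long query that only stops when cancelled.
func waitForCancel(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// demoDeadSocket simulates a client that is already gone.
func demoDeadSocket() error {
	return runGuarded(context.Background(), time.Second, 5*time.Millisecond,
		func() bool { return false }, waitForCancel)
}

// demoTimeout simulates a healthy client whose query exceeds the timeout.
func demoTimeout() error {
	return runGuarded(context.Background(), 30*time.Millisecond, 5*time.Millisecond,
		func() bool { return true }, waitForCancel)
}

func main() {
	fmt.Println("dead socket:", demoDeadSocket())
	fmt.Println("timeout:    ", demoTimeout())
}
```

In the first demo the query ends with a cancellation error well before the one-second deadline; in the second, the deadline is what stops it.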

Bonus: a couple typos fixed in the tests.

cc @lwsanty: please check that this branch works for you with your previously failing test (and, if you can, link the test so I can maybe add it to the unit tests of this project).

@juanjux juanjux changed the title Check for dead sockets before timeout and enforce timeouts [Dont merge yet] Check for dead sockets before timeout and enforce timeouts Aug 16, 2019
@juanjux juanjux requested a review from lwsanty August 16, 2019 15:31
- Added test for row timeout

- Review feedback and some simplified logic for the row read select

- RowTimeout: Use a single NewTimer instead of After for better memory usage

- Restore default timeout and readTimeout values

- Refactor rowloop select

Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
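
The "single NewTimer instead of After" point in the commit message above can be sketched as follows. Calling `time.After` inside a loop allocates a new timer on every iteration, whereas one `time.NewTimer` can be stopped and reset after each row. The names `drain`, `demoRows`, `demoTimeout`, and `errRowTimeout` are hypothetical and not taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// demoTimeout is an arbitrary value chosen for this sketch.
const demoTimeout = 50 * time.Millisecond

var errRowTimeout = errors.New("row read timeout")

// drain counts rows until the channel is closed, failing if no row arrives
// within timeout. A single timer is reused across iterations.
func drain(rows <-chan int, timeout time.Duration) (int, error) {
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	n := 0
	for {
		select {
		case _, ok := <-rows:
			if !ok {
				return n, nil // producer finished
			}
			n++
			// Stop and drain the timer before resetting it for the next row.
			if !timer.Stop() {
				select {
				case <-timer.C:
				default:
				}
			}
			timer.Reset(timeout)
		case <-timer.C:
			return n, errRowTimeout
		}
	}
}

// demoRows returns a closed channel pre-filled with n rows.
func demoRows(n int) <-chan int {
	ch := make(chan int, n)
	for i := 0; i < n; i++ {
		ch <- i
	}
	close(ch)
	return ch
}

func main() {
	n, err := drain(demoRows(3), demoTimeout)
	fmt.Println(n, err)

	// A channel that never yields a row triggers the timeout.
	n, err = drain(make(chan int), demoTimeout)
	fmt.Println(n, err)
}
```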
@juanjux juanjux force-pushed the deadsocket-check branch 2 times, most recently from e806bcd to 0ce415e Compare August 16, 2019 15:37
@src-d src-d deleted a comment from golangcibot Aug 16, 2019
@juanjux juanjux force-pushed the deadsocket-check branch 2 times, most recently from 81f8664 to 141ab18 Compare August 16, 2019 15:52
…ux only)

Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
@juanjux juanjux changed the title [Dont merge yet] Check for dead sockets before timeout and enforce timeouts [Dont merge] Check for dead sockets before timeout and enforce timeouts Aug 16, 2019
@juanjux juanjux requested a review from a team August 16, 2019 23:11
internal/sockstate/netstat_linux.go
internal/sockstate/netstat_darwin.go
internal/sockstate/netstat_linux.go Outdated
func tcpSocks(accept AcceptFn) ([]sockTabEntry, error) {
f, err := os.Open(pathTCPTab)
defer func() {
_ = f.Close()
Contributor

If you're ignoring the error of f.Close(), then just use defer f.Close().

Contributor Author

@juanjux juanjux Aug 19, 2019

It's so that GolangCI shuts up (if it doesn't, this won't pass the merge checks), which it doesn't do with just a defer f.Close().
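
For readers comparing the two forms discussed in this thread, here is a minimal sketch of the explicit-discard pattern. Per the reply above, the linter setup on this project complained about a bare `defer f.Close()`; assigning the result to `_` makes the discard deliberate. `countLines` and `writeTemp` are hypothetical helpers for illustration, not the PR's `tcpSocks`, and the sketch checks the `os.Open` error before deferring the close.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
)

// countLines counts the lines of a file, discarding the Close error explicitly.
func countLines(path string) (int, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	// The file is only read, so the Close error is discarded on purpose;
	// the blank assignment makes that visible to error-checking linters.
	defer func() { _ = f.Close() }()

	n := 0
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		n++
	}
	return n, sc.Err()
}

// writeTemp writes content to a new temporary file and returns its path.
func writeTemp(content string) (string, error) {
	f, err := os.CreateTemp("", "close-demo-*")
	if err != nil {
		return "", err
	}
	if _, err := f.WriteString(content); err != nil {
		_ = f.Close()
		return "", err
	}
	// For a written file the Close error matters, so it is returned.
	return f.Name(), f.Close()
}

func main() {
	path, err := writeTemp("a\nb\n")
	if err != nil {
		fmt.Println("setup error:", err)
		return
	}
	defer func() { _ = os.Remove(path) }()

	n, err := countLines(path)
	fmt.Println(n, err)
}
```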

internal/sockstate/netstat_windows.go
internal/sockstate/sockstate.go Outdated
Contributor

@lwsanty lwsanty left a comment

Approved since my suite passes.
Regarding the tests that you wanted to add, here's my suite: https://github.com/src-d/regression-gitbase/blob/a4afd0e8b11096bfc98fb722cc2f8d3cdec6ec37/cmd/regression-bblfsh-mockups/main.go#L52

But I think these cases are represented in your tests on the lower level.

Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
Contributor

@agarciamontoro agarciamontoro left a comment

LGTM!

server/handler.go Outdated
@juanjux
Contributor Author

juanjux commented Aug 20, 2019

Yesterday I tried many things to get rid of the timeout-enforcing goroutine, without success. I'll try a couple more things today and change the status to mergeable anyway. Windows support will come in another PR so I don't stall this one too much, since it's badly needed.

Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
@juanjux juanjux changed the title [Dont merge] Check for dead sockets before timeout and enforce timeouts Check for dead sockets before timeout and enforce timeouts Aug 20, 2019
@juanjux
Contributor Author

juanjux commented Aug 20, 2019

Ok, so I tried keepalives (both at the higher Go level and at the lower level using syscalls), deadlines, and many other things. It looks like we can't get rid of the second goroutine, so, since this works, it could be merged once the reviews pass. PTAL @erizocosmico @ajnavarro.

I'll work on adding Windows support for the dead socket detector in another PR.

Contributor

@ajnavarro ajnavarro left a comment

check formatting. After that, LGTM

server/context.go Outdated
server/handler_linux_test.go Outdated
Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
@juanjux
Contributor Author

juanjux commented Aug 20, 2019

@ajnavarro gofmt ran on the entire project and goimports on the modified directories.

Development

Successfully merging this pull request may close these issues.

Long processing queries don't stop if the connection is dead/killed until the timeout triggers
5 participants