-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds monitoring of services when opening ui to eventually fail fast #186
Conversation
Regarding this PR, I've added latest docker-sdk as dependency, but it required me to also update docker. Do you think it's better to use an older version? |
As discussed in TFM, we're going to remove Go SDK client in favor of grepping from stdout. |
// The reason is that the prefix of a container can be very long, especially for local | ||
// initialization, due to the value that we set for `COMPOSE_PROJECT_NAME` env var, and | ||
// docker-compose may split the name into multiple lines. | ||
// E.g.: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think that the example is too much?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes :) just 1 line is enough to understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done here.
I've manually tested this by running |
} | ||
|
||
// OpenUI opens the browser with the UI. | ||
func OpenUI(timeout time.Duration) error { | ||
var stdout bytes.Buffer | ||
failFast, err := checkFailFast(&stdout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it took me a while to figure what is going on here. Do you think it would be easier to skip only error "container not available" instead? And fail for all other errors. Also, we might want to have the same check inside waitForContainer
. Currently if, for example, docker server crashed we will ignore the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it took me a while to figure what is going on here. Do you think it would be easier to skip only error "container not available" instead? And fail for all other errors.
So you mean that we could avoid checking for those 3 types of errors? It makes sense to me. What do you think @dpordomingo? I'm asking you as you recently refactored this part regarding error handling AFAIR. I don't know whether there are other errors that we want to skip in addition to "container not available".
Also, we might want to have the same check inside
waitForContainer
. Currently if, for example, docker server crashed we will ignore the error.
In this case, the monitor should correctly raise an error and exit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, we are checking for 3 errors that we know how to display better. But the function can fail with many other errors we don't know about. They are just ignored now.
I propose to invert the logic. Check only for errors we want to skip and return everything else "fast" (including this 3 errors).
Maybe it's not necessary to do in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, the monitor should correctly raise an error and exit.
that is true. But it's a side effect...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose to invert the logic. Check only for errors we want to skip and return everything else "fast" (including this 3 errors).
This makes sense. 👍
that is true. But it's a side effect...
Yeah, it's a side effect. The "problem" is that even if we add that check in the waitForContainer
function, probably will be still the monitor to catch that error as it runs more frequently. But having that check even in waitForContainer
will surely increase the understandability of the flow, so I'm ok adding it.
cmd/sourced/cmd/web.go
Outdated
|
||
if err := browser.OpenURL(url); err != nil { | ||
return errors.Wrap(err, "could not open the browser") | ||
var newLineFormatter = regexp.MustCompile(`(\r\n|\r|\n)`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you need to do it manually? strings.TrimSpace
already remove all space characters: https://golang.org/pkg/unicode/#IsSpace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just tried it and yes no need to do it manually. I thought that it would have replace \r\n
with two new lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done here.
cmd/sourced/cmd/web.go
Outdated
} | ||
|
||
return nil | ||
go func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's move go
out of this function, it will be easier to understand and would allow to use the function in sync mode if needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's mentioned here but not very clear. There was a better explanation somewhere (most probably in go blog or somewhere else, I can't find it right now). The idea is it's better to keep functions blocking and if you need to run them in non-blocking way the caller can do it which is much easier to reason about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems fair, thanks for link!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done here.
cmd/sourced/cmd/web.go
Outdated
strings.TrimSpace(servicesBuf.String())), "\n") | ||
|
||
for _, service := range services { | ||
go runMonitorService(service, ch) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the advantage of using goroutines here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason is to be more reactive to state changes in containers. The alternative that I thought is to remove the while true
and the sleep
from the runMonitorService
function and to write this for loop as:
for {
for _, service := range services {
runMonitorService(service, ch)
}
}
The problem is that the call to docker-compose ps <service>
takes an amount of time near to ~1 second (at least in my setup). So with the non-gorountine solution, the container of a service is checked once every ~n
-seconds where n
is the number of services. With the gorountine solution empirically each service each checked much faster. With the sleep
of 1 second as now, every service is checked every ~4 seconds. It's to be noted that this is based only on empirical observations and that I'm pretty sure also for the gorountine solution the amount of time is a function of n
too.
Long story short, with our current number of services docker answers faster to all our requests if they're made concurrently instead of serially. This may change if in future we add more services.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if we're happy with a less reactive solution in favor of having easier code, I'm also ok with that. I don't think that a user would get mad for waiting a couple of seconds more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion about one or another approach.
If I was writing it from scratch I would choose simpler code over 2 seconds delay of error reporting (taking into account that the user sees the loader at this time). But if you prefer to keep it as it is now, it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with both really 😄. Let's see also what @dpordomingo and @carlosms say then. 👍
cmd/sourced/cmd/web.go
Outdated
if err := browser.OpenURL(url); err != nil { | ||
return errors.Wrap(err, "could not open the browser") | ||
var newLineFormatter = regexp.MustCompile(`(\r\n|\r|\n)`) | ||
var stateExtractor = regexp.MustCompile(`(?m)^srcd-[\w\d]+.*(Up|Exit (\w+))`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it be (Exit (\d+)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ops, nice catch! And also the \d
is redundant in the first part, in the end it will be as follows:
(?m)^srcd-\w+.*(Up|Exit (\d+))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done here.
LGTM, but let's add an entry to the changelog please. |
I completely missed this 😞. I'll do it ASAP. |
Signed-off-by: Lou Marvin Caraig <loumarvincaraig@gmail.com>
Signed-off-by: Lou Marvin Caraig <loumarvincaraig@gmail.com>
Signed-off-by: Lou Marvin Caraig <loumarvincaraig@gmail.com>
Signed-off-by: Lou Marvin Caraig <loumarvincaraig@gmail.com>
Signed-off-by: Lou Marvin Caraig <loumarvincaraig@gmail.com>
Signed-off-by: Lou Marvin Caraig <loumarvincaraig@gmail.com>
Signed-off-by: Lou Marvin Caraig <loumarvincaraig@gmail.com>
2a9ce54
to
d0214bf
Compare
Rebased and added changelog entry here. |
Signed-off-by: Lou Marvin Caraig <loumarvincaraig@gmail.com>
a3f1337
to
68bb532
Compare
This closes #147.
When the spinner is running and the user is waiting for the ui to open, a monitoring of the state of the containers is performed in order to stop if an unexpected state occurs.
I also ran integration tests on windows.