Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds monitoring of services when opening ui to eventually fail fast #186

Merged
merged 8 commits into from
Aug 15, 2019

Conversation

se7entyse7en
Copy link
Contributor

This closes #147.

When the spinner is running and the user is waiting for the ui to open, a monitoring of the state of the containers is performed in order to stop if an unexpected state occurs.

I also ran integration tests on windows.

@se7entyse7en
Copy link
Contributor Author

Regarding this PR, I've added latest docker-sdk as dependency, but it required me to also update docker. Do you think it's better to use an older version?

cmd/sourced/cmd/web.go Outdated Show resolved Hide resolved
@se7entyse7en
Copy link
Contributor Author

As discussed in TFM, we're going to remove Go SDK client in favor of grepping from stdout.

@se7entyse7en
Copy link
Contributor Author

@carlosms @smacker I updated and force pushed as the implementation completely changed.

// The reason is that the prefix of a container can be very long, especially for local
// initialization, due to the value that we set for `COMPOSE_PROJECT_NAME` env var, and
// docker-compose may split the name into multiple lines.
// E.g.:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think that the example is too much?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes :) just 1 line is enough to understand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done here.

@se7entyse7en
Copy link
Contributor Author

I've manually tested this by running sourced init local <path> on a shell, and by running docker kill <gitbase container id> in another.

}

// OpenUI opens the browser with the UI.
func OpenUI(timeout time.Duration) error {
var stdout bytes.Buffer
failFast, err := checkFailFast(&stdout)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it took me a while to figure what is going on here. Do you think it would be easier to skip only error "container not available" instead? And fail for all other errors. Also, we might want to have the same check inside waitForContainer. Currently if, for example, docker server crashed we will ignore the error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it took me a while to figure what is going on here. Do you think it would be easier to skip only error "container not available" instead? And fail for all other errors.

So you mean that we could avoid checking for those 3 types of errors? It makes sense to me. What do you think @dpordomingo? I'm asking you as you recently refactored this part regarding error handling AFAIR. I don't know whether there are other errors that we want to skip in addition to "container not available".

Also, we might want to have the same check inside waitForContainer. Currently if, for example, docker server crashed we will ignore the error.

In this case, the monitor should correctly raise an error and exit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, we are checking for 3 errors that we know how to display better. But the function can fail with many other errors we don't know about. They are just ignored now.

I propose to invert the logic. Check only for errors we want to skip and return everything else "fast" (including this 3 errors).

Maybe it's not necessary to do in this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, the monitor should correctly raise an error and exit.

that is true. But it's a side effect...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose to invert the logic. Check only for errors we want to skip and return everything else "fast" (including this 3 errors).

This makes sense. 👍

that is true. But it's a side effect...

Yeah, it's a side effect. The "problem" is that even if we add that check in the waitForContainer function, probably will be still the monitor to catch that error as it runs more frequently. But having that check even in waitForContainer will surely increase the understandability of the flow, so I'm ok adding it.


if err := browser.OpenURL(url); err != nil {
return errors.Wrap(err, "could not open the browser")
var newLineFormatter = regexp.MustCompile(`(\r\n|\r|\n)`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need to do it manually? strings.TrimSpace already remove all space characters: https://golang.org/pkg/unicode/#IsSpace

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just tried it and yes no need to do it manually. I thought that it would have replace \r\n with two new lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done here.

}

return nil
go func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's move go out of this function, it will be easier to understand and would allow to use the function in sync mode if needed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's mentioned here but not very clear. There was a better explanation somewhere (most probably in go blog or somewhere else, I can't find it right now). The idea is it's better to keep functions blocking and if you need to run them in non-blocking way the caller can do it which is much easier to reason about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fair, thanks for link!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done here.

strings.TrimSpace(servicesBuf.String())), "\n")

for _, service := range services {
go runMonitorService(service, ch)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the advantage of using goroutines here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason is to be more reactive to state changes in containers. The alternative that I thought is to remove the while true and the sleep from the runMonitorService function and to write this for loop as:

for {
    for _, service := range services {
        runMonitorService(service, ch)
    }
}

The problem is that the call to docker-compose ps <service> takes an amount of time near to ~1 second (at least in my setup). So with the non-gorountine solution, the container of a service is checked once every ~n-seconds where n is the number of services. With the gorountine solution empirically each service each checked much faster. With the sleep of 1 second as now, every service is checked every ~4 seconds. It's to be noted that this is based only on empirical observations and that I'm pretty sure also for the gorountine solution the amount of time is a function of n too.
Long story short, with our current number of services docker answers faster to all our requests if they're made concurrently instead of serially. This may change if in future we add more services.

Copy link
Contributor Author

@se7entyse7en se7entyse7en Aug 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if we're happy with a less reactive solution in favor of having easier code, I'm also ok with that. I don't think that a user would get mad for waiting a couple of seconds more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion about one or another approach.
If I was writing it from scratch I would choose simpler code over 2 seconds delay of error reporting (taking into account that the user sees the loader at this time). But if you prefer to keep it as it is now, it's fine.

Copy link
Contributor Author

@se7entyse7en se7entyse7en Aug 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with both really 😄. Let's see also what @dpordomingo and @carlosms say then. 👍

if err := browser.OpenURL(url); err != nil {
return errors.Wrap(err, "could not open the browser")
var newLineFormatter = regexp.MustCompile(`(\r\n|\r|\n)`)
var stateExtractor = regexp.MustCompile(`(?m)^srcd-[\w\d]+.*(Up|Exit (\w+))`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it be (Exit (\d+)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ops, nice catch! And also the \d is redundant in the first part, in the end it will be as follows:

(?m)^srcd-\w+.*(Up|Exit (\d+))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done here.

@carlosms
Copy link
Contributor

carlosms commented Aug 9, 2019

LGTM, but let's add an entry to the changelog please.

@se7entyse7en
Copy link
Contributor Author

LGTM, but let's add an entry to the changelog please.

I completely missed this 😞. I'll do it ASAP.

Signed-off-by: Lou Marvin Caraig <loumarvincaraig@gmail.com>
Signed-off-by: Lou Marvin Caraig <loumarvincaraig@gmail.com>
Signed-off-by: Lou Marvin Caraig <loumarvincaraig@gmail.com>
Signed-off-by: Lou Marvin Caraig <loumarvincaraig@gmail.com>
Signed-off-by: Lou Marvin Caraig <loumarvincaraig@gmail.com>
Signed-off-by: Lou Marvin Caraig <loumarvincaraig@gmail.com>
Signed-off-by: Lou Marvin Caraig <loumarvincaraig@gmail.com>
@se7entyse7en
Copy link
Contributor Author

Rebased and added changelog entry here.

CHANGELOG.md Outdated Show resolved Hide resolved
Signed-off-by: Lou Marvin Caraig <loumarvincaraig@gmail.com>
@se7entyse7en
Copy link
Contributor Author

@carlosms force pushed here.

@se7entyse7en se7entyse7en merged commit 21613d7 into src-d:master Aug 15, 2019
@se7entyse7en se7entyse7en deleted the fail-fast branch August 15, 2019 12:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fail fast during sourced initialization if a component breaks
3 participants