Rewrite cucumber test caching (and support logic) #2795
e85f386 to ff097c8
default: '--require features --tags ~@stress --tags ~@todo',
verify: '--require features --tags ~@todo --tags ~@bug --tags ~@stress -f progress',
jenkins: '--require features --tags ~@todo --tags ~@bug --tags ~@stress --tags ~@options -f progress',
bugs: '--require features --tags @bug',
are we completely removing the bugs tag, or is this an accident to leave this out?
ok, looks like all bugs are removed below.
ah this was just a small cleanup. @todo and @bug seem to have served the same function, so it is probably easier to have just one.
There are actually quite a few test cases marked as todo that seem weird, we should maybe clean up at some point. (@stress also seems to be dysfunctional)
Fully agree on the clean-up.
I tried to use todo as missing feature and bug as broken feature. I don't think that this separation serves much of a purpose, though. I lack a complete overview of whether there are important distinctions in the code base, but I agree that it looks like a good candidate for clean-up.
Some tests are currently failing because @lbud does this sound familiar to any bugs you hit back in the day? EDIT: Fixed this. maxBuffer for exec was too small.
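For context on the maxBuffer fix: Node's child_process.exec collects the child's whole output in memory and terminates the child once stdout or stderr exceeds the maxBuffer option. A minimal sketch of raising that limit, assuming a hypothetical helper named runBuffered and an arbitrary 64 MiB ceiling (neither is taken from this PR):

```javascript
const { exec } = require('child_process');

// Hypothetical helper, not the PR's code: run a shell command and buffer its output.
// exec() terminates the child if stdout or stderr grows past `maxBuffer` bytes,
// so verbose tools need a larger limit than the default.
function runBuffered(cmd, callback) {
  const options = { maxBuffer: 64 * 1024 * 1024 }; // assumed limit, tune as needed
  return exec(cmd, options, (err, stdout, stderr) => callback(err, stdout, stderr));
}
```

child_process.spawn streams output instead of buffering it, so it has no comparable limit.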
Tests pass for me locally but there seem to be problems on travis. /cc @springmeyer
I've been poking at this branch. All tests are passing locally, but on travis there are still 60 failures:
And what looks like a bug in our JS code whereby we try to write to the log file after
I was able to replicate the above errors on linux in a docker image of Ubuntu Xenial. It looks like the I'm unsure of the cause of these failures. But I wonder if the move from
@springmeyer it is nodejs/node#2098
It worked before on linux because osrm-routed was
I would totally be in favor of using
Everything is now abstracted using
tldr; I also think spawn should be fine for us now. @TheMarex you may be remembering the problem I hit with
Thanks everyone for the help and comments. My next action: convert
Still one issue with the log stream: it must be deleted in a callback at https://github.com/Project-OSRM/osrm-backend/pull/2795/files#diff-122ec2288966c104a49c369d3146f016R38, not in a race before the child exits.
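The ordering concern with the log stream can be sketched as follows. This is an illustration under assumptions, not the code at the linked line: runWithLog is a hypothetical helper. It pipes both output streams with { end: false } so neither closes the shared log early, and it ends the log only from the child's 'close' event, which Node emits after the process has ended and its stdio streams are closed.

```javascript
const { spawn } = require('child_process');
const fs = require('fs');

// Hypothetical helper: run `cmd`, append its stdout and stderr to `logPath`,
// and close the log only after the child's stdio has been fully drained.
function runWithLog(cmd, args, logPath, callback) {
  const log = fs.createWriteStream(logPath, { flags: 'a' });
  const child = spawn(cmd, args);
  let finished = false;

  function finish(err, code) {
    if (finished) return; // 'error' and 'close' can both fire
    finished = true;
    log.end(() => callback(err, code));
  }

  // end: false keeps the first stream to finish from closing the shared log.
  if (child.stdout) child.stdout.pipe(log, { end: false });
  if (child.stderr) child.stderr.pipe(log, { end: false });

  child.on('error', (err) => finish(err));
  // 'close' fires after the process has ended and its stdio streams are closed.
  child.on('close', (code) => finish(null, code));
  return child;
}
```

Ending the stream from 'exit' instead could race with output still buffered in the pipes, which is one way to end up writing to a log that has already been closed.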
Thank you @oxidase for the great work on using
Testing locally and noticing what the # of files inside
But after a run there are 128 files that are newly modified. This is a minor problem but still seems unexpected to me: https://gist.github.com/springmeyer/c9ae99dd3f6ead94c1070365893c178e. @TheMarex can you think of a reason why the files in the cache would be overwritten even if nothing changed in the osrm-backend code?
Looking like a race condition. I've found that when I run
Found and fixed the problem in c1b504d. I noticed that 1 time in 10 the hash returned from osrm-backend/features/lib/hash.js (lines 8 to 21 in c1b504d)
@springmeyer looks good to me for rebase and merge. The last timeout issue was due to slow md5 sum computation with a cold FS cache (locally for me ~10 seconds).
9575bc5 to f95f1f2
Just found one issue - https://github.com/Project-OSRM/osrm-backend/blob/tests/caching/features/support/hooks.js#L53 always makes osrm-routed shut down. I will revert it and add osrm-routed logging output to a new stream.
f95f1f2 to 6fbfdf1
@TheMarex there is still a problem with a data race if a timeout occurs during osrm-extract or osrm-contract. 1b4f087 is a partial "solution", because if a timeout occurs during osrm-extract, the scheduled osrm-contract will be started with parameters for the next scenario, but the timestamp will be created for the original one. I see two possibilities:
@oxidase thanks for your further work on this. I propose we merge this and ticket the remaining issue you are seeing around behavior when a timeout occurs. Since, if I am understanding right, 1) timeouts should be rare, 2) you have ideas for handling them better, and 3) we have other PRs that would benefit from this being merged first (notably #2834). Does that sound right? Do you agree that merging this and dealing with timeout handling in a separate PR would be acceptable?
Data race occurs in this.processedCacheFile after timeouts in runBin callbacks
6fbfdf1 to b2d45b7
I rebased this onto master. Conflicts were just around retagging. Update: argh, messed this up the first time.
b2d45b7 to 7fdb8f6
fbbfbfa to f6e2ba0
@springmeyer I have implemented the first option, because making a global queue with abortable jobs would require too much time. At the moment, if preprocessing starts it will finish even if a timeout occurs. Most important is that stamp files correspond to the original scenario. The last change makes only extraction and contraction consistent. If timeout occurs during
To fix it the corrupted OSM files must be removed, but for a 5 second timeout it is very unlikely. TL;DR: LGTM for merge.
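One way to keep stamp files tied to the original scenario is to snapshot the per-scenario paths before starting the asynchronous chain, so callbacks that fire after a timeout never read state belonging to the next scenario. The following is a hedged sketch, not the code merged here: extractThenContract, the runBin(binary, file, callback) signature, and the inputCacheFile property are assumptions made for illustration, while processedCacheFile and runBin are names mentioned in this thread.

```javascript
// Hypothetical sketch: `world` stands in for the cucumber `this`, and `runBin`
// for a helper with the assumed signature runBin(binary, file, callback).
function extractThenContract(world, runBin, done) {
  // Snapshot per-scenario paths up front. After a timeout the next scenario may
  // overwrite world.processedCacheFile before these callbacks run.
  const inputFile = world.inputCacheFile; // assumed property name
  const processedFile = world.processedCacheFile;

  runBin('osrm-extract', inputFile, (extractErr) => {
    if (extractErr) return done(extractErr);
    runBin('osrm-contract', processedFile, (contractErr) => {
      done(contractErr, processedFile);
    });
  });
}
```

Because the callbacks close over local constants rather than reading from the shared object, reassigning the object's properties for the next scenario does not change which files the pending steps operate on.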
👍
@springmeyer seems fine. Since it is not touching any features, we are in a good state. I just would love to get 5.4 fixed before introducing major features into master, since porting patches could get painful otherwise. Hopefully we will have this done soon.
This reverts commit 7d124ce.
Great work here everyone. 🙇
Issue
This PR implements #2745. As a result a lot of the support logic has to be reworked. Logs are now handled differently: they are saved by scenario in test/logs/{feature_path}/{scenario}.log.
/cc @MoKob @daniel-j-h @danpat @lbud
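As an illustration of the per-scenario layout test/logs/{feature_path}/{scenario}.log, here is a minimal sketch of how such a path could be derived. The function name scenarioLogPath, its parameters, and the rule for sanitizing scenario names are all assumptions for this example; the PR's actual implementation may differ.

```javascript
const path = require('path');

// Hypothetical helper: map a feature file and scenario name to a log file path.
function scenarioLogPath(logsRoot, featuresRoot, featureFile, scenarioName) {
  // Feature path relative to the features directory, without its extension.
  const rel = path.relative(featuresRoot, featureFile);
  const featurePath = rel.slice(0, rel.length - path.extname(rel).length);
  // Assumed sanitizing rule: collapse characters that are awkward in file names.
  const safeName = scenarioName.trim().replace(/[^A-Za-z0-9_.-]+/g, '_');
  return path.join(logsRoot, featurePath, safeName + '.log');
}
```

With these assumptions, a scenario named "Car - Access tags" in features/car/access.feature would log to the file Car_-_Access_tags.log under test/logs/car/access.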
Tasklist
Requirements / Relations
None.