-
Notifications
You must be signed in to change notification settings - Fork 575
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement automated pull request testing for Trilinos #1155
Comments
Is there no work done on this yet? What is the ETA for when we might get this implemented for Trilinos? |
@jwillenbring, you indicated offline that @allevin has already been working on this for some time. If that is the case, why is this Issue not put "In Progress" and have its status logged? Why is it still listed as "Ready" in the Trilinos Framework Kanban board? |
@jwillenbring and I had a meeting with Kitware staff (Brad K., Bill H. and Ben B.) to discuss a new automation system for GitLab merge-requests they have put together for testing and merging changes using a topic-branch workflow to CMake called "Ghostflow" which is short for "git hosted workflow" which was outlined in the slides they presented: In a nutshell, what Ghostflow does is to create a new throwaway integration test branch after every change the the main development branch (e.g. 'master') or to any of the marked (i..e with the 'stage' command) PR branches. (Note: this throwaway integration test branch called 'stage' is really more like the 'pu' (proposed upstream) branch described in gitworkflows(7) and less like the 'next' branch described in that workflow.) This throwaway integration test branch 'stage' is then run through a more detailed set of automated builds on a number of different platforms each day (or more frequently if one would like). The advantage of this approach over the simpler independent branch testing is that it scales better with the number of branches. For example, the CUDA builds for Trilinos for ATDM take upwards of 10 hours or more to run (and that is without rebuilds testing package-by-package so this is O(7x) or more that basic GCC build). For those platforms, we can't test each PR branch independently. This would allow us to test all of the branches. The PR testing workflow that we would construct given Ghostflow might be something like:
This is just an idea of what this workflow would look like. The tweaks and knobs would be:
However, the Ghostflow workflow we were presented is not yet ready to implement a process like described above. In particular, the following would need to be added or implemented by Kitware:
But Kitware could not start working on this until Ben B. got back from vacation on July 5th. And they estimated it could take a few months to implement this. So the earliest we could potentially get this implemented and used would be into September 2017 or later. Therefore, the decision was to wait and see how the refactoring work that @allevin is leading is progressing and then make a decision to if to look into Ghostflow with Kitware after the July 5th. I will add a reminder to bring up this topic at a future Kitware contract meeting after July 5th. (But the work that @allevin is doing with Jenkins could likely be directly leveraged in the updated Ghostflow implementation for Trilinos.) |
@allevin Told me this morning that he has a rough version of this ready for me to try next week. He will write up instructions. There is still a gap in directing people to the results in CDash and other smaller gaps to fill. The initial implementation will not support testing multiple pull requests simultaneously. |
As part of TriBITSPub/TriBITS#154, the TRIBITS_CTEST_DRIVER() functioin will create a text file that will contain the URL to the result on CDash. The python code can then read that file and print that URL in the GitHub comment. It will not take long to do that so let me know when you want that to use. (That is also needed for nice Travis CI output pointing to CDash). |
Any recent update on this? I will be gone on vacation for the next 2 weeks and I fear a breakdown of the CI process using the checkin-test-sems.sh script (see #1304). The above comment from 24 days ago said that it should have been ready to test about 15 days ago or so. |
As described in TriBITSPub/TriBITS#214 (comment) you should have everything you need from CMake/CTest and TriBITS to implement this. Improvements can be made from there (like embedding the exact CDash URL but you can always give the genertic URL and you can give the SHA1 in the build name so it is easy to find). Let's set up a meeting between @allevin and myself for when I get back from vacation the week of 8/28 to finish this. |
I have been testing Aaron's work along with some Jenkins jobs I put together to do very basic testing of Trilinos. This has been largely successful in the narrow scope of testing without reporting to CDash (which is the scope @maherou encouraged me to pursue given other difficulties). An example of a pull request that underwent successful testing with this tool (but is lacking final inspection) can be found here: The current path forward is as follows:
|
@jwillenbring My personal opinion is that you should not do automatic merging. This prevents other people from a chance to look at your code and notice the problems. I understand that it will slow down the process but the improved quality should compensate for that. |
@aprokop you mention that you're concerned more people won't have time to review pull requests properly. What would you consider to be an acceptable amount of reviewers to look over a pull request? Do you have any thoughts on how to prevent a pull request from being trapped in limbo waiting for the necessary reviews? |
This is more of a social issue, and depends on the PR type (new feature/bug fix/comment fix). I would say for high priority things the author is allowed to merge once CI tests pass. For other smaller sized patches, couple days would be my guess for a deadline. For a brand new feature, depends. |
What Kitware implemented with their "ghostflow" system with GitHub was to have people add commands in the comments that fire off builds, do the final merges, etc. See: That is the system to study I think. |
I should have said more here. The auto merge will not occur until a review is performed on the pull request (the review settings are configurable, but I will describe the current plan). If someone is outside of the Trilinos organization, and the branch containing the changes is not on the trilinos/Trilinos repo, then that review must occur before the testing will even kick off. This is to allow people to make sure things aren't happening like insertion of malicious code, or changing code in a way that works on one machine, but disables all the testing so that it doesn't fail on another. If someone is part of the Trilinos organization, or the branch is containing the changes is on the trilinos/Trilinos repo, then the review can occur anytime (before, after, or during testing), but must occur before the auto merge will take place. As currently configured, one person needs to approve the changes, and there need to be 0 changes requested. So, if someone goes in and looks at it and says "good", but before the testing finishes someone else says "try this, request changes" then it won't merge, but if the second reviewer doesn't get there before the testing stops, it will merge. So, if someone agrees with the changes, but wants someone else to look, they should just comment, not approve. |
@jwillenbring Thanks for the explanation, it sounds a lot better than what I had in mind before.
Is this keyword based? Or Github review "Request Changes"? |
Github review "Request Changes" |
Archiving email update on this so poeple can find this. From: Trilinos-developers [mailto:trilinos-developers-bounces@trilinos.org] On Behalf Of Willenbring, James M All, CRITICAL SUMMARY
Today the pull request testing tool was run successfully for the first time on the main Trilinos repository. Following the successful run of the 4.9.3 and 4.8.4 tests, I merged the change to develop. The PR was in support of the SIERRA/Trilinos integration process. You can see the details here: As discussed at TUG, there are a lot of improvements we still want to make to the pull request testing. However, I feel it is in good enough shape to execute the soft launch we have been planning. Here are the critical points to remember:
Here are some useful links if you are not familiar with the pull request workflow: https://help.github.com/articles/about-pull-requests/ https://help.github.com/articles/creating-a-pull-request-from-a-fork/ https://help.github.com/articles/about-forks/ https://help.github.com/articles/syncing-a-fork/ Keep in mind as you read this information that we use “develop” as the branch we apply changes directly to for Trilinos, not “master”. We are currently trying to gauge the need for training. Many developers are very comfortable with the pull request workflow and use it regularly. If you would like to receive some training, please send me (jmwille@sandia.gov) an email and let me know what concepts need clarification. We can then try to figure out if we should provide specific links, group training, 1 on 1 training, etc. I plan to begin the soft launch tomorrow. If there are questions or concerns, let me know. Thanks Jim |
The pull request testing soft launch is live. Pull requests are being tested, but passing tests is not required. The next major objective is to make the output of the tests more easily available to developers. |
Updated documentation on the usage of TriBITS CTest -S driver scripts is now posted at: Please have a look and please not any typos you might see. I will provide the skeleton for a script to drive the Jenkins PR branch testing later today. It should be very little code. Then we can discuss what needs to be added to TribitsCTestDriverCore.cmake in order to better support Trilinos PR testing. It don't think it is a lot of work to provide as much as one would want for this use case. |
I sent the following email to the trilinos-developers list today: All, I want to update you on the Trilinos Framework Team’s efforts to deploy Pull Request testing for Trilinos. Skip the “Details” (literally) if you are not interested. Results for a Pull Request build using GCC 4.8.4 can now be found under the “Pull Request” group on this dashboard: https://testing-vm.sandia.gov/cdash/index.php?project=Trilinos. Details: This dashboard is using a version of CDash that is under testing. However, we cannot use the testing.sandia.gov CDash dashboard because the submissions made by the pull request testing require a newer version of CDash than is available on that server. You can now designate a Pull Request as a “work in progress”, using the “AT: WIP” label in GitHub, instead of putting [WIP] in the title of the pull request. Details: The label works the same as the title keyword. “AT” in the name of the label stands for “Autotester”. This prefix will be used for all future labels related to the autotester, which runs the Pull Request testing. You can now request a “retest” of your pull request using the “AT: RETEST” label in GitHub. Details: Reasons why you may want to request another set of tests against your pull request include: a. The pull request testing previously failed for a non-code related issue. For example, a communication failure with GitHub. b. The pull request testing previously failed due to a defect not directly related to your pull request and you want to test to see if the defect has been fixed with a PR that has been applied since your PR was last tested. c. The pull request testing previously passed, but other PR’s have been applied since. The framework team is still contemplating if we should deem tests “stale” after some amount of time, or if we should require tests to be run immediately prior to applying a PR, but for now if there is any concern that a PR might not pass the testing with the current state of the develop branch, it is easy to request a new set of tests. The largest current impediment to deploying pull request testing more fully is that the new testing driver for pull request testing that submits to CDash does not reliably return pass/fail correctly to GitHub in all situations. Details: Until we can count on the new driver to reliably provide Pass/fail information for the corner cases we have discovered, we need to keep running the old-style jobs that do not report to the dashboard. We are working on improved documentation for the Pull Request testing capability. We remain committed to a phased deployment approach. Details: Future planned improvements include deeming test results to be “stale” after a period of time (and requiring a new set of tests to run prior to merging), enforcing that all status checks pass prior to merge, A modified “automerge” concept that will give developers more control of requesting that things be merged when possible, and strengthened requirements for GitHub reviews that will also allow exceptions to be made when appropriate. Additionally, we are looking seriously at adding a CUDA build, and removing the GCC 4.9.3 build (as the errors caught with this build that are not caught by the 4.8.4 build are few). Jim |
Thanks Jim! I would very much like to see a CUDA build, even if it only runs a small subset of tests. |
With topic branch wokflows more common, it is better to use --no-rebase by default to be safe. Build/Test Cases Summary Enabled Packages: Teuchos Disabled Packages: PyTrilinos,Claps,TriKota 0) MPI_RELEASE_DEBUG_SHARED_PT => passed: passed=137,notpassed=0 (2.71 min)
@jwillenbring, would we not consider this done? The auto PR testing system has been mandatory for several months now. |
ThCan we close? We don't need lingering framework issues contributing to the bloat of 1000+ open Trilinos issue. |
This has been done for some time. |
@trilinos/framework
@allevin is planning to develop a capability that will allow for testing of Trilinos pull request prior to having the changes applied to the develop branch. This story will track that work. A rough outline for the work is as follows:
Adapt some existing scripts that test the pull requests for another project to be more general and serve the Trilinos case.
Deploy on a limited, opt-in basis.
Deploy on a full basis for all Trilinos develop branch modifications.
The text was updated successfully, but these errors were encountered: