TpetraCore CrsMatrix pack and unpack test failures on clean build #1395
Comments
This error doesn't occur with the standard SEMS checkin script; I am looking at it now with the Jenkins environment that shows the failure.
@mhoemmen, perhaps? There is strange behavior on the test that intentionally sends in bad …
@tjfulle I'm stuck in meetings all day today and tomorrow or I'd help you out :(
@mhoemmen, the source of the error is in …
@tjfulle my bad :( Can you fix it? Feel free to push. Thanks!
I've got the fix in and the tests are running as we speak :)
The single standard CI build is a build with debug-mode checking turned on. P.S. Note that a PR model (like what should be getting worked on in #1155) that runs several builds for each independent PR branch is identical to running the checkin-test-sems.sh script with extra builds (as long as all of the builds use the same SEMS env, which would be the case here). The difference is that with the checkin-test script you supply the hardware to run the builds and tests, and it is therefore more scalable than a centralized PR approach.
@bartlettroscoe, I think adding an optional …
Thanks @tjfulle! :-D @bartlettroscoe Usually I'll do MPI_RELEASE with CUDA, since some packages have had trouble in the past finishing the debug build (libraries too large, so the linker fails) with NVCC.
@mhoemmen, the PR is in :)
#1398 addresses these failures. @jwillenbring, this issue can be closed when the clean build shows up as clean.
I have added it to my local checkin-test-sems.sh script and am running it locally before pushing.
@bartlettroscoe wrote: "am running it locally before pushing" -- does that refer to @tjfulle's PR, or your changes?
@mhoemmen, my PR was generated from a commit that was already pushed by the SEMS checkin script.
This build failed in the "Clean" dashboard of Trilinos and there was a request in #1395 to add this. That fully optimized MPI release build is an important customer build, so developers might consider running it more often before pushing.

Build/Test Cases Summary
Enabled Packages:
Disabled Packages: PyTrilinos,Claps,TriKota
Enabled all Packages
0) MPI_RELEASE_DEBUG_SHARED_PT => Test case MPI_RELEASE_DEBUG_SHARED_PT was not run! => Does not affect push readiness! (-1.00 min)
1) MPI_RELEASE_SHARED_PT => passed: passed=2338,notpassed=0 (101.69 min)
I accidentally set Trilinos_ENABLE_DEBUG=ON which just made this build identical to the standard MPI_RELEASE_DEBUG_SHARED_PT build (which is not helpful).

Build/Test Cases Summary
Enabled Packages:
Disabled Packages: PyTrilinos,Claps,TriKota
Enabled all Packages
0) MPI_RELEASE_DEBUG_SHARED_PT => Test case MPI_RELEASE_DEBUG_SHARED_PT was not run! => Does not affect push readiness! (-1.00 min)
1) MPI_RELEASE_SHARED_PT => passed: passed=2334,notpassed=0 (8.17 min)
FYI: I ran the MPI_RELEASE_SHARED_PT build and it passed and pushed (see below). @tjfulle and @mhoemmen, this means that if you want to test the MPI fully optimized build, you can do that by adding:
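(The exact options were not preserved in this thread. Below is a minimal sketch, assuming the standard TriBITS --extra-builds mechanism that checkin-test-sems.sh wraps; the .config file name and its contents are illustrative guesses, not the exact "Clean" dashboard configuration.)

```
# Put an MPI_RELEASE_SHARED_PT.config file next to your local checkin-test
# driver with the cache options for a fully optimized MPI shared build,
# for example (contents are an assumption):
#   -DTPL_ENABLE_MPI:BOOL=ON
#   -DCMAKE_BUILD_TYPE:STRING=RELEASE
#   -DTrilinos_ENABLE_DEBUG:BOOL=OFF
#   -DBUILD_SHARED_LIBS:BOOL=ON
# then run it as an extra build on top of the standard CI build:
./checkin-test-sems.sh --extra-builds=MPI_RELEASE_SHARED_PT --do-all
```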
But this is not guaranteed to be exactly the same as the "Clean" build. P.S. The strange thing is that I got an MPI abort error in the test FEI_fei_ubase_MPI_3. But when I ran the full Trilinos test suite again, it passed (and that is what is shown below). I don't remember the last time I got a random MPI failure on this machine running the MPI_RELEASE_DEBUG_SHARED_PT CI build. Not sure what that means.
@jwillenbring, these two @trilinos/tpetra tests show up as failures on CDash again this morning. @bartlettroscoe and I have independently run the test suite with all tests passing. Are the tests on CDash using the most up-to-date code?
CDash tells you exactly what version of Trilinos is being tested and how to see what new commits were pulled since the last time a build was run. See instructions on finding that info at: |
@bartlettroscoe Just curious -- why is it that changing Tpetra enables FEI tests? FEI doesn't depend on Tpetra at all. It's likely a prerequisite of Panzer, so I understand why it needs to be built, but why should it need to be tested? |
Thanks @bartlettroscoe! It looks like the errors reported on CDash are from a checkout of Trilinos that does not yet have yesterday's fixes.
FYI: Panzer does not have a dependency on FEI. It was dropped as a required package a while ago. |
Just configure Trilinos with:
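(The configure options are not preserved here; a plausible reconstruction, assuming the standard TriBITS dependency-tracing cache variables, would be something like the following, where $TRILINOS_SRC is a placeholder for the source tree.)

```
cmake \
  -D Trilinos_DUMP_PACKAGE_DEPENDENCIES=ON \
  -D Trilinos_ENABLE_Tpetra=ON \
  -D Trilinos_ENABLE_ALL_FORWARD_DEP_PACKAGES=ON \
  -D Trilinos_ENABLE_TESTS=ON \
  $TRILINOS_SRC
```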
and look at the STDOUT output. It tells you exactly why. See:
So FEI depends on ML, which depends on Isorropia, which depends on Tpetra, and there you have it (sort of like how everyone in Hollywood is related to Kevin Bacon in less than 6 steps). This is a major problem with Trilinos. Trilinos packages need to be better structured into subpackages and then have the dependencies controlled between subpackages. I will write a story for this and try to raise awareness of this problem (once again). Any questions?
Panzer depending on FEI would not enable the FEI tests, just the libs. FEI needs to have an (indirect) upstream dependency on Tpetra to do that (which it has).
@trilinos/tpetra tests all pass on the clean builds this morning.
@bartlettroscoe wrote:
Thanks for clarifying :-). I didn't realize Isorropia depended on Tpetra. I think that was just someone's abandoned research project. If I do the following, would that decouple FEI from Tpetra?
If so, I'll open a new issue to do this.
Yup, that is the type of thing you have to do. Note that Thyra has subpackages ThyraTpetra and ThyraEpetra and downstream packages depend on one or the other (sometimes both), as can be seen by:
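(The exact command is not preserved; a simple grep over the TriBITS dependency files shows roughly the same thing. The path pattern assumes the standard Trilinos package layout.)

```
# List the package-level dependency files that reference the Thyra adapter subpackages.
grep -lE "ThyraTpetra|ThyraEpetra" packages/*/cmake/Dependencies.cmake
```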
But what kills you are sloppy indirect dependencies that end up enabling a bunch of extra downstream dependencies anyway. I am going to create a more general issue to get this problem on the map, so that Trilinos developers can be alerted to start doing this type of work to better control the dependencies in Trilinos (which has many advantages, not just speeding up CI testing). There are likely a dozen or more packages in Trilinos that need to be broken down into subpackages, and then a bunch more packages that need to narrow their dependence to just the subpackages they need.
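(To make the kind of restructuring described above concrete, here is a hedged sketch, not an actual proposal from this thread, of how a TriBITS parent package's cmake/Dependencies.cmake can declare subpackages so that only one subpackage carries the Tpetra dependency. The package name "Foo", the subpackage names, and the directories are made up.)

```
# cmake/Dependencies.cmake for a hypothetical package "Foo" split so that
# only the adapter subpackage pulls in Tpetra.
TRIBITS_PACKAGE_DEFINE_DEPENDENCIES(
  SUBPACKAGES_DIRS_CLASSIFICATIONS_OPTREQS
    #  SubPackage       Directory         Classification  Req/Opt
       Core             core              PT              REQUIRED
       TpetraAdapters   adapters/tpetra   PT              OPTIONAL
  )
# The Tpetra dependency then lives only in the subpackage's own
# adapters/tpetra/cmake/Dependencies.cmake, e.g.:
#   TRIBITS_PACKAGE_DEFINE_DEPENDENCIES(LIB_REQUIRED_PACKAGES Tpetra FooCore)
```

Downstream packages that only need the Epetra-side functionality would then declare a dependence on FooCore alone, so an enable of Tpetra would no longer pull their tests into a CI run.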
Thanks @bartlettroscoe! I think it will help us a lot to purge unnecessary dependencies. A big problem is that people add experimental features that never get enabled in the nightlies, but force optional dependencies that do get enabled. More intelligent use of subpackages could help with that. Also, we have branches and PRs now; we should use those instead of polluting the repo with some random's untested dissertationware.
CC: @trilinos/framework
@trilinos/tpetra
There have been two Tpetra test failures on the GCC 4.9.3 MPI build
Linux-gcc-4.9.3-MPI_Release_gcc_4.9.3_openmpi_1.8.7_DEV
for the last week: http://testing.sandia.gov/cdash/viewTest.php?onlyfailed&buildid=2934080
It is important to resolve these issues, either by fixing the underlying problem or by disabling the tests, so we can move in the direction of making automated decisions based on 100% clean builds.