SEACAS tests failing in ATDM CUDA builds starting 4/26/2018 #2650

Closed
bartlettroscoe opened this issue Apr 27, 2018 · 43 comments
Labels
client: ATDM (Any issue primarily impacting the ATDM project); PA: Data Services (Issues that fall under the Trilinos Data Services Product Area); pkg: seacas; type: bug (The primary issue is a bug in Trilinos code or tests)

Comments

@bartlettroscoe
Member

bartlettroscoe commented Apr 27, 2018

CC: @trilinos/seacas, @gsjaardema, @fryeguy52

Next Action Status

Updated SEACAS is also causing mesh reading problems on non-CUDA builds for larger numbers of MPI ranks. PR #2653, which reverts PR #2625 updating SEACAS, was merged on 4/27/2018. A new issue will be opened if the next SEACAS snapshot causes an error.

Description

As shown in the query:

The SEACAS tests:

  • SEACASIoss_exodus32_to_exodus32
  • SEACASIoss_exodus32_to_exodus32_pnetcdf
  • SEACASIoss_exodus32_to_exodus64

are failing in all of the current ATDM Trilinos CUDA builds:

  • Trilinos-atdm-hansen-shiller-cuda-debug
  • Trilinos-atdm-hansen-shiller-cuda-opt
  • Trilinos-atdm-white-ride-cuda-debug
  • Trilinos-atdm-white-ride-cuda-opt

This was likely due to the update of SEACAS into Trilinos in the commit 89d48ad merged in PR #2625.

Steps to Reproduce

One should be able to reproduce these failing tests on the machines white (SON), ride (SRN), hansen (SON), or shiller (SRN) as described in:

For example, on white one should be able to reproduce these failing tests with:

$ cd <some_build_dir>/

$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh cuda-debug

$ cmake \
  -GNinja \
  -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
  -DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_SEACAS=ON \
  $TRILINOS_DIR

$ make NP=16

$ bsub -x -Is -q rhel7F -n 16 ctest -j16
@bartlettroscoe bartlettroscoe added type: bug The primary issue is a bug in Trilinos code or tests pkg: seacas client: ATDM Any issue primarily impacting the ATDM project labels Apr 27, 2018
@bartlettroscoe
Member Author

@gsjaardema,

Just curious, but did you get a CDash email about these failures like the one shown below? It looks like Trilinos is currently set up to send email to the address seacas-regression at software.sandia.gov. It looks like that mailman list exists. Is it set up to send you emails?


From: CDash [mailto:trilinos-regression@sandia.gov]
Sent: Thursday, April 26, 2018 11:23 AM
To: Bartlett, Roscoe A
Subject: FAILED (t=3): Trilinos/SEACAS - Trilinos-atdm-hansen-shiller-cuda-debug - ATDM

A submission to CDash for the project Trilinos has failing tests.
You have been identified as one of the authors who have checked in changes
that are part of this submission or you are listed in the default contact list.

Details on the submission can be found at
https://testing.sandia.gov/cdash/buildSummary.php?buildid=3530258

Project: Trilinos
SubProject: SEACAS
Site: hansen
Build Name: Trilinos-atdm-hansen-shiller-cuda-debug
Build Time: 2018-04-26T15:19:09 UTC
Type: ATDM
Tests failing: 3

Tests failing
SEACASIoss_exodus32_to_exodus32_pnetcdf
(https://testing.sandia.gov/cdash/testDetails.php?test=47122041&build=3530258)
SEACASIoss_exodus32_to_exodus32
(https://testing.sandia.gov/cdash/testDetails.php?test=47122042&build=3530258)
SEACASIoss_exodus32_to_exodus64
(https://testing.sandia.gov/cdash/testDetails.php?test=47122043&build=3530258)

-CDash on testing.sandia.gov

@bartlettroscoe
Member Author

These tests all seem to be terminating early with the error:

terminate called after throwing an instance of 'std::runtime_error'
  what():  cudaDeviceSynchronize() error( cudaErrorCudartUnloading): driver shutting down /home/jenkins/hansen/workspace/Trilinos-atdm-hansen-shiller-cuda-debug/SRC_AND_BUILD/Trilinos/packages/kokkos/core/src/Cuda/Kokkos_Cuda_Impl.cpp:119

@rppawlo
Contributor

rppawlo commented Apr 27, 2018

It's more than just CUDA. @pwxy and @bathmatt are seeing failures on other machines as well when going to larger MPI process counts.

@pwxy

pwxy commented Apr 27, 2018

Broke EMPIRE on mutrino HSW and KNL. For example, on mutrino HSW, when reading an exodus mesh decomposed into 8 subdomains, EMPIRE is fine. But when trying to read the same exodus mesh decomposed into 16 subdomains, we get the following errors:

Exodus Library Warning/Error: [ex_check_valid_file_id]
ERROR: In "ex_inquire_internal", the file id -1 was not obtained via a call to "ex_open" or "ex_create".
It does not refer to a valid open exodus file.
Aborting to avoid file corruption or data loss or other potential problems.
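
The message above says the id passed to ex_inquire_internal was -1, which suggests that an earlier open call failed and its return value was used anyway. A minimal sketch of checking the id before using it follows. It uses hypothetical stand-in functions: demo_open_file and demo_read_mesh are illustrative only and are not part of Exodus or EMPIRE.

```c
#include <stdio.h>

/* Hypothetical stand-in for a library open call: returns a positive id on
 * success and -1 on failure (here, when the file cannot be opened). */
static int demo_open_file(const char *path)
{
  FILE *fp = fopen(path, "rb");
  if (fp == NULL) {
    return -1;
  }
  fclose(fp);
  return 1;
}

/* Hypothetical reader: stops before any further calls if the id is invalid.
 * Returns 0 on success and -1 if the file could not be opened. */
static int demo_read_mesh(const char *path)
{
  int file_id = demo_open_file(path);
  if (file_id < 0) {
    fprintf(stderr, "could not open '%s'; skipping inquiry calls\n", path);
    return -1;
  }
  /* Calls that take file_id would go here. */
  return 0;
}
```

Failing at the open call reports the path that could not be opened, which is usually easier to act on than a later complaint about an invalid file id.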

@bartlettroscoe
Member Author

Is there some way to define a native SEACAS test that can show these failures and then fix the failing test? Can the failure being described be demonstrated with a smaller number of MPI ranks?

@bathmatt
Contributor

Can we revert it until this is all worked out? I've confirmed on my standard RHEL 7 desktop that this breaks reading exodus files with more than 9 MPI ranks. No idea why.

@bathmatt
Contributor

@bartlettroscoe I'm betting it happens in panzer tests as well, I'll run mini-EM and verify

@bathmatt
Contributor

The legendary @pwxy figured out that if you remove the leading 0s in the decomposed mesh file names (mesh.16.9 and not mesh.16.09), it now works. Was this on purpose? Does decomp make the right meshes?
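
For context, decomposed Exodus files are conventionally named base.<nprocs>.<rank>, with the rank zero-padded to the number of digits in nprocs. That is why single-digit ranks only pick up a leading 0 once there are 10 or more ranks. A minimal C sketch of that naming rule, assuming this convention holds for the meshes in question; decomp_file_name is a hypothetical helper, not a SEACAS or EMPIRE function:

```c
#include <stdio.h>

/* Number of decimal digits in a positive integer. */
static int decimal_width(int n)
{
  int width = 1;
  while (n >= 10) {
    n /= 10;
    ++width;
  }
  return width;
}

/* Hypothetical helper: writes "<base>.<nproc>.<rank>" into buf, zero-padding
 * the rank to the number of digits in nproc. Returns snprintf's result. */
static int decomp_file_name(char *buf, size_t size, const char *base,
                            int nproc, int rank)
{
  return snprintf(buf, size, "%s.%d.%0*d", base, nproc,
                  decimal_width(nproc), rank);
}
```

Under that assumption, rank 9 of 16 maps to mesh.16.09 while rank 3 of 8 maps to mesh.8.3, so code that builds or parses these names by hand only hits the padded form at 10 or more ranks.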

@pwxy

pwxy commented Apr 27, 2018

Actually it was the AMAZING GENIUS @bathmatt who figured this out!

@bathmatt
Contributor

mini-EM works with the mesh decomped. No idea now what is going on. We use the same mesh reader.

@gsjaardema
Contributor

gsjaardema commented Apr 27, 2018 via email

@gsjaardema
Contributor

gsjaardema commented Apr 27, 2018 via email

@bathmatt
Contributor

SEMS versions. The EMPIRE 0# thing is internal to some error checking in EMPIRE; I'm not sure why this error changes things. Did something change in
int file_id = ex_open_int(file.c_str(), mode, &comp_ws, &io_ws, &version, EX_API_VERS_NODOT);
I can work around that issue, but the cuda one I'm not sure... Don't revert for the EMPIRE issue, I will fix it on our end.

@gsjaardema
Contributor

@bartlettroscoe CUDA tests are failing during application shutdown. My guess is that the SEACAS standalone has an older version of Kokkos and the newer Trilinos/Kokkos has different shutdown behavior or requirements... Will try to verify.

@pwxy

pwxy commented Apr 27, 2018

For mutrino:
hdf5 1.10.1
netcdf 4.4.1.1

@gsjaardema
Contributor

@bathmatt There were changes to ex_open_int, but primarily they should have been limited to error checking. There are some issues in some NetCDF versions based on some defines that should be there but are missing which might mess things up. I will try with SEMS and see.

I will hold off on reverting unless others request...

@bartlettroscoe
Member Author

What library versions of netcdf and hdf5 are being used for these builds?

@gsjaardema, if you look at the configure output on CDash at, for example:

it shows:

Processing enabled TPL: HDF5 (enabled explicitly, disable with -DTPL_ENABLE_HDF5=OFF)
-- HDF5_LIBRARY_NAMES='hdf5;z;hdf5_hl'
-- TPL_HDF5_LIBRARIES='-L/home/projects/x86-64-haswell-nvidia/hdf5/1.10.1/openmpi/2.1.1/gcc/4.9.3/cuda/8.0.61/lib;-lhdf5_hl;-lhdf5;-lz;-ldl'
-- TPL_HDF5_INCLUDE_DIRS='/home/projects/x86-64-haswell-nvidia/hdf5/1.10.1/openmpi/2.1.1/gcc/4.9.3/cuda/8.0.61/include'
Processing enabled TPL: Netcdf (enabled explicitly, disable with -DTPL_ENABLE_Netcdf=OFF)
-- Netcdf_LIBRARY_NAMES='netcdf'
-- TPL_Netcdf_LIBRARIES='-L/home/projects/x86-64/boost/1.55.0/lib;-L/home/projects/x86-64-haswell-nvidia/netcdf-exo/4.4.1.1/openmpi/2.1.1/gcc/4.9.3/cuda/8.0.61/lib;-L/home/projects/x86-64-haswell-nvidia/netcdf-exo/4.4.1.1/openmpi/2.1.1/gcc/4.9.3/cuda/8.0.61/lib;-L/home/projects/x86-64-haswell-nvidia/pnetcdf-exo/1.8.1/openmpi/2.1.1/gcc/4.9.3/cuda/8.0.61/lib;/home/projects/x86-64/boost/1.55.0/lib/libboost_program_options.a;/home/projects/x86-64/boost/1.55.0/lib/libboost_system.a;/home/projects/x86-64-haswell-nvidia/netcdf-exo/4.4.1.1/openmpi/2.1.1/gcc/4.9.3/cuda/8.0.61/lib/libnetcdf.a;/home/projects/x86-64-haswell-nvidia/pnetcdf-exo/1.8.1/openmpi/2.1.1/gcc/4.9.3/cuda/8.0.61/lib/libpnetcdf.a;-L/home/projects/x86-64-haswell-nvidia/hdf5/1.10.1/openmpi/2.1.1/gcc/4.9.3/cuda/8.0.61/lib;-lhdf5_hl;-lhdf5;-lz;-ldl'
-- TPL_Netcdf_INCLUDE_DIRS='/home/projects/x86-64-haswell-nvidia/netcdf-exo/4.4.1.1/openmpi/2.1.1/gcc/4.9.3/cuda/8.0.61/include'
Processing enabled TPL: BoostLib (enabled explicitly, disable with -DTPL_ENABLE_BoostLib=OFF)

I think these were installed by the test bed team. If those need an upgrade, then we need to contact them.

@gsjaardema
Contributor

There are some potential issues with hdf5-1.10.1, especially when used with an older netcdf, in that it can potentially create files that are not readable with older versions of hdf5-1.8.X. The HDF5 group fixed this in hdf5-1.10.2 with the special build option --with-default-api-version=v18, and we added a patch to NetCDF-4.6.2-devel to select v1.8 compatibility. Issues don't always appear, and if using consistent library versions it should be OK.

I probably need to get more involved in the SEMS discussions to avoid some of this...

@gsjaardema
Contributor

@bartlettroscoe The configure output you are showing seems to also indicate that it isn't using the FindNetcdf.cmake that is in TriBITS? It should be setting some other TPL_Netcdf_* symbols that don't seem to be there. The output I usually see is something like:

-- Found NetCDF: /Users/gdsjaar/src/seacas-parallel/lib/libnetcdf.dylib;/Users/gdsjaar/src/seacas-parallel/lib/libhdf5_hl.dylib;/Users/gdsjaar/src/seacas-parallel/lib/libhdf5.dylib;/usr/lib/libz.dylib;/usr/lib/libdl.dylib;/usr/lib/libm.dylib;/Users/gdsjaar/src/seacas-parallel/lib/libpnetcdf.a
-- NetCDF Version: netCDF 4.6.2-development
--      NetCDF_NEEDS_HDF5        = True
--      NetCDF_NEEDS_PNetCDF     = True
--      NetCDF_PARALLEL          = True
--      NetCDF_INCLUDE_DIRS      = /Users/gdsjaar/src/seacas-parallel/include;/Users/gdsjaar/src/seacas-parallel/include;/Users/gdsjaar/src/seacas-parallel/include
--      NetCDF_LIBRARIES         = /Users/gdsjaar/src/seacas-parallel/lib/libnetcdf.dylib;/Users/gdsjaar/src/seacas-parallel/lib/libhdf5_hl.dylib;/Users/gdsjaar/src/seacas-parallel/lib/libhdf5.dylib;/usr/lib/libz.dylib;/usr/lib/libdl.dylib;/usr/lib/libm.dylib;/Users/gdsjaar/src/seacas-parallel/lib/libpnetcdf.a
--      NetCDF_BINARIES          = ncdump;ncgen;nccopy
-- Netcdf_LIBRARY_NAMES='netcdf'
-- TPL_Netcdf_LIBRARIES='/Users/gdsjaar/src/seacas-parallel/lib/libnetcdf.dylib;/Users/gdsjaar/src/seacas-parallel/lib/libhdf5_hl.dylib;/Users/gdsjaar/src/seacas-parallel/lib/libhdf5.dylib;/usr/lib/libz.dylib;/usr/lib/libdl.dylib;/usr/lib/libm.dylib;/Users/gdsjaar/src/seacas-parallel/lib/libpnetcdf.a'
-- TPL_Netcdf_INCLUDE_DIRS='/Users/gdsjaar/src/seacas-parallel/include;/Users/gdsjaar/src/seacas-parallel/include;/Users/gdsjaar/src/seacas-parallel/include'
Processing enabled TPL: CGNS (enabled explicitly, disable with -DTPL_ENABLE_CGNS=OFF)

The main symbols that I need are NetCDF_NEEDS_HDF5, NetCDF_PARALLEL, and NetCDF_NEEDS_PNetCDF = True | False

@bartlettroscoe
Member Author

@bartlettroscoe I'm betting it happens in panzer tests as well, I'll run mini-EM and verify

@bathmatt, no, all of the panzer tests and examples fully passed on all of the builds we currently have running as shown in the CDash query:

(Ignore the one failure on 'ride' for the build Trilinos-atdm-white-ride-gnu-opt-openmp. We have seen tests randomly fail on 'ride' that pass just fine on the identical machine 'white'. That is why this build on 'ride' was demoted to the "Specialized" CDash Track/Group. See #2511.)

Did this updated Trilinos fail any EMPIRE automated tests? If not, then someone needs to add an automated test to either SEACAS (best), Panzer (okay) or EMPIRE (if nothing else) to cover this use case.

@micahahoward, you might want to be aware of this in case this impacts SPARC on your next update of Trilinos.

All,

Unless some changes in SEACAS are urgent for some Trilinos customer, we can just back out the merge commit from PR #2625 so that people can fix this offline in a non stressful way.

@bartlettroscoe
Member Author

The configure output you are showing seems to also indicate that it isn't using the FindNetcdf.cmake that is in TriBITS?

@gsjaardema, this is using the EMPIRE configure of Trilinos copied from the scripts in the EM-Plasma/BuildScripts/ repo. If we can update the ATDM configuration to better use the FindNetcdf.cmake module (hopefully in an updated TplFindNetcdf.cmake module), then we can use it. But that would require careful testing on every platform before we could push that to the 'develop' branch. Or we would have to make the change, demote all of the ATDM builds going to the "ATDM" CDash Track/Group back down to the "Specialized" CDash Track/Group, and then cross our fingers. This is what I did with the last major upgrade of the ATDM Trilinos configuration (when we last synced with the configuration in the scripts in the EM-Plasma/BuildScripts/ repo, which was a while ago now).

@gsjaardema
Contributor

@bartlettroscoe I will add a SEACAS test covering the use case, but I'm not sure what use case is failing currently (other than the CUDA-related ones).

Since I won't be able to do much until Wednesday, may be best to back out the merge commit from #2625. It was not urgent for any customers.

@bartlettroscoe
Member Author

@gsjaardema,

Since I won't be able to do much until Wednesday, may be best to back out the merge commit from #2625. It was not urgent for any customers.

Okay, so unless there is an objection, I am going to back this merge commit out.

@bathmatt
Contributor

The EMPIRE issue has been resolved with changes in EMPIRE. You probably strengthened checks in SEACAS that were now triggering. But the issues on EMPIRE are resolved. Now the CUDA stuff is a different matter.

@gsjaardema
Contributor

@bartlettroscoe Not sure I understand the issue with FindNetcdf.cmake in TriBITS? I thought that all Trilinos builds used the TriBITS code and that we had fully vetted the FindNetcdf issues several months ago?

@pwxy

pwxy commented Apr 27, 2018

@bathmatt I backed out your change from "0" to "EX_API_VERS_NODOT" in the ex_open_int call yesterday when I was debugging, and it didn't help with the problem of EMPIRE failing to read the exodus files:
int file_id = ex_open_int(file.c_str(), mode, &comp_ws, &io_ws, &version, EX_API_VERS_NODOT);

@gsjaardema
Contributor

@pwxy, @bathmatt: You should not be calling ex_open_int in your application. You should call ex_open as you always have in the past. ex_open_int is an internal-only function that is called by the ex_open wrapper, which automatically adds EX_API_VERS_NODOT to verify that the include file seen by the application matches the include file used when the library was compiled.

This has been in place for several years, so wasn't particular to this commit.

Please go back to ex_open
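
The wrapper pattern described above can be sketched in self-contained C. This is an illustration of the general technique only, not the Exodus source: demo_open, demo_open_int, DEMO_API_VERS, and demo_lib_api_vers are hypothetical names, and the version number is made up.

```c
#include <stdio.h>

/* Version baked into the header the application compiles against
 * (hypothetical value). */
#define DEMO_API_VERS 816

/* Version the "library" was compiled with. In a real library this would
 * live in the library's own translation unit. */
static const int demo_lib_api_vers = 816;

/* Internal entry point, not meant to be called directly by applications.
 * Returns a positive id on success, -1 on a null path or version mismatch. */
static int demo_open_int(const char *path, int mode, int header_api_vers)
{
  (void)mode; /* unused in this sketch */
  if (path == NULL) {
    return -1;
  }
  if (header_api_vers != demo_lib_api_vers) {
    fprintf(stderr, "header version %d does not match library version %d\n",
            header_api_vers, demo_lib_api_vers);
    return -1;
  }
  return 1; /* placeholder id */
}

/* Public entry point: the macro appends the version from the header that
 * the application was compiled against. */
#define demo_open(path, mode) demo_open_int((path), (mode), DEMO_API_VERS)
```

An application that calls demo_open("mesh.exo", 0) gets the check for free. Calling the internal function directly means supplying the version argument by hand, which is how a stale or wrong value can slip in.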

@gsjaardema
Contributor

If you ever think that there was a non-backward-compatible change to exodus, please let me know before changing any code. There are a few deprecated functions, but they are still usable. You should never need to change your application code for a new Exodus version unless I explicitly mention it in an email or release notes or other notification.

@bathmatt
Contributor

@gsjaardema, there was a non-backward-compatible change only to the extent that I was doing something wrong and getting away with it, and you added error checking that caught it. Shame on you, shame on you :)

I'd have seen it if I ran periodic meshes on more than 9 ranks with bad values.

@gsjaardema
Contributor

@bathmatt OK, I was just a little worried about references to the ex_open_int function. I need to see if I can hide it better. I had a user yesterday also trying to use it...

@bartlettroscoe
Member Author

Not sure I understand issue with FindNetcdf.cmake in TriBITs? I thought that all Trilinos builds used the TriBITs code and that we had fully vetted the FindNetcdf issues several months ago?

@gsjaardema, as shown at:

  • export ATDM_CONFIG_NETCDF_LIBS="-L${BOOST_ROOT}/lib;-L${NETCDF_ROOT}/lib;-L${NETCDF_ROOT}/lib;-L${PNETCDF_ROOT}/lib;-L${HDF5_ROOT}/lib;${BOOST_ROOT}/lib/libboost_program_options.a;${BOOST_ROOT}/lib/libboost_system.a;${NETCDF_ROOT}/lib/libnetcdf.a;${PNETCDF_ROOT}/lib/libpnetcdf.a;${HDF5_ROOT}/lib/libhdf5_hl.a;${HDF5_ROOT}/lib/libhdf5.a;-lz;-ldl"

and

the EMPIRE configuration of Trilinos bypasses the FindNetcdf.cmake find module and just directly sets the include dirs and libraries. This mode is allowed to support direct setting and backward compatibility as per:

It is possible to update this configuration of Trilinos to allow the use of find_package(NetCDF), but doing that safely will take a lot of testing, including against builds of EMPIRE and manual testing by EMPIRE developers and users (the native Panzer and EMPIRE test suites don't test all functionality from SEACAS that is used by EMPIRE developers and users, see TRIL-171).

As we discussed before, we don't have any specific documentation in the Trilinos build reference for how to configure with this specialized Netcdf setup. Therefore, we can't expect people to know how to use this.

@pwxy

pwxy commented Apr 27, 2018

@gsjaardema I think that user was me. It was the exact ex_open_int() call in EMPIRE I mentioned above. I was trying to track down the issue with failing to read exodus files with more than 9 MPI ranks, so I was looking at the exodus calls in EMPIRE.

@gsjaardema
Contributor

@bartlettroscoe RE: FindNetcdf.cmake. OK, I understand. We will probably need to modify the environment.sh and ATDMDevEnvSettings.cmake to add manual definitions of some of the symbols set in FindNetcdf.cmake in order to make sure the builds are consistent.

Alternatively, I will see if I can determine the settings down in the SEACAS CMake-related code at configure time which may be more robust than relying on manual settings...

@gsjaardema
Contributor

@pwxy It was ptlin, but it may have been related to the same issue since the symptoms seemed similar...

@bartlettroscoe
Member Author

Alternatively, I will see if I can determine the settings down in the SEACAS CMake-related code at configure time which may be more robust than relying on manual settings...

@gsjaardema, let me know if you have any difficulty reproducing any of these ATDM builds. I tried to make it as easy as I could. Just source a single script with the build name that you want, run raw cmake passing in a single *.cmake file, and enable any package you want.

@pwxy

pwxy commented Apr 27, 2018

@gsjaardema Well, that ptlin guy is a moron!

bartlettroscoe added a commit to bartlettroscoe/Trilinos that referenced this issue Apr 27, 2018
…b_snapshot"

This reverts commit 1b19c57, reversing
changes made to aa0c96b.

There are some issues with this update that is documented in trilinos#2650.  This
reverts the updates in PR trilinos#2650.  Reverting these changse will allow the
issues to be fixed offline in a non urgent way.
bartlettroscoe added a commit to bartlettroscoe/Trilinos that referenced this issue Apr 27, 2018
…b_snapshot"

This reverts commit 1b19c57, reversing
changes made to aa0c96b.

There are some issues with this update that are documented in trilinos#2650.  This
reverts the updates pulled in from PR trilinos#2625.  Reverting these changes will
allow the issues to be fixed offline in a non urgent way.
@bartlettroscoe
Member Author

@gsjaardema, I created the PR #2653 to revert this merge commit from #2625. Can you please approve it?

bartlettroscoe added a commit that referenced this issue Apr 27, 2018
…hot-pr-2625

Revert "Merge pull request #2625 from gsjaardema/seacas_github_snapshot"

This is temp fix for some issues with this update that are documented in #2650.
The issues can now be addressed offline.
@bartlettroscoe
Member Author

@gsjaardema approved PR #2653, which reverted this SEACAS update. It passed testing and I merged it just now. Therefore, we should see these CUDA failures go away, and if EMPIRE updates its Trilinos develop branch, the issues should be gone now.

I will leave this issue open to track efforts to address these issues offline.

Note that to help address this, you can build Trilinos with the version of SEACAS from the independent SEACAS git repo. You just clone (or symlink) @gsjaardema's SEACAS git repo under the main Trilinos git repo like:

$ cd Trilinos/
$ git clone git@github.com:gsjaardema/seacas.git

Then when you configure Trilinos, add the cache var:

-D SEACAS_SOURCE_DIR_OVERRIDE:STRING=seacas

NOTE: This is exactly how SPARC builds Trilinos with SEACAS currently.

That might make it easier to iteratively debug and fix the issues.

@bathmatt
Contributor

Thanks, the EMPIRE issue is resolved, apart from the CUDA issue.

@bartlettroscoe
Member Author

The merge was backed out on 4/27/2018. Should we still keep this issue open, or should we close it?

@gsjaardema, steps to reproduce with the CUDA build are given at the top of this issue. Note that you can work with your native SEACAS repo by cloning or symlinking the seacas repo under Trilinos/ and then configure, build, and test with:

$ cd <some_build_dir>/

$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh cuda-debug

$ cmake \
  -GNinja \
  -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
  -DSEACAS_SOURCE_DIR_OVERRIDE:STRING=seacas \
  -DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_SEACAS=ON \
  $TRILINOS_DIR

$ make NP=16

$ bsub -x -Is -q rhel7F -n 16 ctest -j16

If that does not work, please let me know.

@bartlettroscoe
Member Author

This was resolved by backing out the merge commit of the latest SEACAS snapshot over 2 weeks ago. If a failure occurs on the next SEACAS snapshot, then we will open a new issue for that.

@gsjaardema
Contributor

This can be closed. The SEACAS source code in Trilinos is up-to-date with both SEACAS github and SEACAS/Sierra, all tests are passing, and there have been no reports of issues from other projects.

@bartlettroscoe
Member Author

@gsjaardema, I closed this issue back on 5/21/2018 as noted above. I figured that if there were any new issues on a new snapshot of SEACAS, then we would open new issues.

Just to verify, looking at the SEACASIoss_exodus32_XXX tests that ran in the CUDA builds on hansen yesterday, we can see that the tests:

  • SEACASIoss_exodus32_to_exodus32
  • SEACASIoss_exodus32_to_exodus32_pnetcdf
  • SEACASIoss_exodus32_to_exodus64

reported failing above are passing in all of the CUDA builds.

NOTE: We currently don't have working CUDA builds on 'white' and 'ride' due to a system upgrade (see TRIL-215).

5 participants