ADIOS plugin #901

Closed
dayalsoap opened this issue Jun 1, 2015 · 6 comments

@dayalsoap

From our discussion the other week, we want to start testing the ADIOS plugin with some of the in-situ/staging methods for ADIOS, Flexpath in particular. While going over the code and during the discussion, we found a few issues.

  1. At each output epoch, a new file name is created. For the staging methods, file names are used to establish connectivity between different processes. Re-establishing the communication graphs at each output epoch causes a lot of overhead.

  2. At each output epoch, a new set of variables is defined. This causes issues for us because we use variable names to create serialization formats, so each output epoch translates into a new format.

  3. Array dimensions and offsets are passed in as numeric values and not as variables. ADIOS allows you to specify variables that will hold the array dimensions and offsets. The only caveat is that those variables have to be written via adios_write before the array is written.
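
The ordering caveat in point 3 can be illustrated with a small toy model. This is only a sketch and does not use the ADIOS API: `ToyOutputStep`, `write_scalar` and `write_array` are hypothetical names invented for the illustration, and the size check is an assumption about what a consumer of the dimension variables might do.

```python
class ToyOutputStep:
    """Toy stand-in for one output step. This is NOT the ADIOS API."""

    def __init__(self, array_dims):
        # array_dims maps an array name to the names of the scalar
        # variables that hold its dimensions, e.g. {"E_x": ("nx", "ny")}.
        self.array_dims = array_dims
        self.scalars = {}
        self.arrays = {}

    def write_scalar(self, name, value):
        # Dimension/offset variables are ordinary scalars in this model.
        self.scalars[name] = value

    def write_array(self, name, data):
        # The array can only be interpreted once its dimension
        # variables have been written in the same step.
        missing = [d for d in self.array_dims[name] if d not in self.scalars]
        if missing:
            raise RuntimeError(f"write {missing} before array {name!r}")
        expected = 1
        for d in self.array_dims[name]:
            expected *= self.scalars[d]
        if len(data) != expected:
            raise ValueError(
                f"{name!r}: expected {expected} values, got {len(data)}"
            )
        self.arrays[name] = list(data)
```

In this model, writing `nx` and `ny` first and then `E_x` succeeds, while writing `E_x` first raises an error, which mirrors the "dimensions before array" rule described above.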

I am working on some of this now, but I also wanted to know whether using the XML document would be a better option. Using define vars is perfectly fine, but there are some advantages to using an XML document. Using an XML document, however, would require some bigger changes to the plugin, I believe. Basically, I don't think you get variable IDs with the XML document.

@PrometheusPi PrometheusPi added the component: plugin in PIConGPU plugin label Jun 2, 2015
@PrometheusPi PrometheusPi added this to the Open Beta milestone Jun 2, 2015
@ax3l ax3l added refactoring code change to improve performance or to unify a concept but does not change public API and removed question labels Jun 3, 2015
@ax3l
Member

ax3l commented Jun 3, 2015

Thank you for collecting all the points in the issue! ✨

  1) and 2) exist mainly because we were afraid, from our experience with HDF5, that a crash in the code could corrupt previous outputs if the file handles are still open. In addition, we can dump both checkpoints and regular outputs with the same routines (but containing different data sets). This requires some refactoring but will be possible!

  3) If we used the XML API: you are right, we could use variables for dimensions. But that would require us to write an XML file at runtime with all the information, which means adding an XML writer, introducing additional overhead, adding another layer of abstraction for that information, and adding XML library dependencies. As a side note, variable IDs are currently only used (in a non-fancy way) to avoid defining some information twice between calculating the buffers, opening the files and actually writing the data.

Since XML requires a lot of restructuring for us and does not naturally fit into our implementation: can we go with the C-API, too?

We would also actually need to write two files, one for checkpointing and one for dumps... And we would have to expose a lot of information in variables, such as moving window simulations, additional variables that are dumped at other intervals, etc. We can try to add this in the long term, but it will make the whole thing way more complex on our side (and from my experience during the implementation of #679 it would take some time without scientific output on our side).

@dayalsoap
Author

So, regarding the XML document, why would it have to be written programmatically? I'm guessing this is because the variable names aren't the same across configurations?

The no_xml API is perfectly fine anyway. We can still define array dimensions/offsets as variables. This way, the defines would only need to be called once after the adios_init call. Does the set of output variables change within a run?
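
To make the "define once" argument concrete, here is a minimal sketch of why redefining variables at every output epoch is costly when formats are derived from variable names. It assumes, purely for illustration, that the per-epoch definitions differ in their names; `format_key`, `count_formats` and the `/data/<epoch>/...` paths are hypothetical and not taken from the plugin or from ADIOS.

```python
def format_key(var_names):
    """Hypothetical stand-in for a serialization format derived from names."""
    return tuple(sorted(var_names))


def count_formats(names_per_epoch):
    """Count how many distinct formats a run would create."""
    return len({format_key(names) for names in names_per_epoch})


epochs = range(3)

# Assumed scheme A: variables are redefined per epoch with epoch-specific names.
redefined = [[f"/data/{e}/E_x", f"/data/{e}/E_y"] for e in epochs]

# Scheme B: variables are defined once and reused with stable names.
stable = [["E_x", "E_y"] for _ in epochs]

print(count_formats(redefined), count_formats(stable))
```

Under these assumptions the first scheme creates one format per epoch, while the second creates a single format for the whole run.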

@ax3l
Member

ax3l commented Jun 3, 2015

The XML documents would have to be written programmatically, since users can specify, for the default core but also for various plugins, how often and what they should dump (and calculate). An example is the input file fileOutput.param, which is completely free to user input and additionally compiled in for each run. User-side command-line input might also influence the output.

If no_xml is possible we can probably start way faster, that is great! The set of output variables is currently determined either by compiled-in settings or by command-line arguments. But we have plugins such as our 3D live rendering, where a connected scientist can actually select at run time which variables he/she would like to analyse (really selecting another source during the run; ideally creating a new source from existing combinations in the future, as in the ParaView Calculator pipeline element).

Nevertheless, for the start and in the current implementation (fileOutput.param + checkpoints) we can assume the set is known for the whole sim at the point of the adios_init call. Can we still vary the output periodicity reacting on non-fixed in-simulation events?

@dayalsoap
Author

Using the no_xml API is definitely fine. I guess the only real issue there is that changing the I/O method, and the parameters given to the I/O method, would require a recompile; perhaps they could instead be passed in via the command line or some other input file.
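
As a rough sketch of the command-line option, the I/O method and its parameters could be parsed at startup and handed to the output code instead of being compiled in. The flag names `--io-method` and `--io-params`, the `key=value;key=value` parameter format, and the default method name are assumptions made for this example only.

```python
import argparse


def parse_io_options(argv):
    """Parse a hypothetical I/O method name and its parameter string."""
    parser = argparse.ArgumentParser(description="toy I/O option parser")
    parser.add_argument("--io-method", default="MPI")
    parser.add_argument("--io-params", default="")
    args = parser.parse_args(argv)

    # Split "a=1;b=2" into {"a": "1", "b": "2"}; empty items are skipped.
    params = {}
    for item in filter(None, args.io_params.split(";")):
        key, _, value = item.partition("=")
        params[key.strip()] = value.strip()
    return args.io_method, params
```

For example, `parse_io_options(["--io-method", "FLEXPATH", "--io-params", "a=1;b=2"])` returns the method name together with a dictionary of the two (hypothetical) parameters, so switching the method would not need a recompile.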

> Can we still vary the output periodicity reacting on non-fixed in-simulation events?

Absolutely.

@dayalsoap
Author

So, there definitely is a risk in terms of crashes and losing the whole BP file. The solution, I've been told by the ADIOS developers, is to use separate files, which is how the ADIOS plugin currently works. Maybe a better option for me, then, is to just write a separate ADIOS plugin geared towards the in-situ/staging transports?

@ax3l
Member

ax3l commented Jun 14, 2015

Thank you for the information!

There is another aspect of using just a "single" file that I got from upstream: the parallel "append" is not even close to the performance of regular writes, unfortunately.

Update: afaik, append performance was drastically improved in ADIOS 1.9 or 1.10; this needs a double check.

@ax3l ax3l modified the milestones: Future, 0.2.0: Open Beta Nov 11, 2016