Install PIConGPU on Juwels #3365
Comments
Hello @MaxThevenet, thanks for your report. Off the top of my head, I do not know the reason for this error. However, while we are investigating it, you could skip this optional dependency and just remove |
Thanks @sbastrakov for your suggestion, I followed instructions at https://picongpu.readthedocs.io/en/0.5.0/usage/basics.html and, after a bunch of warnings (but no error), I received messages
so it seems that the build was successful \o/. Is TBG the usual way to go to run a simulation on Juwels? If so, I'll go through the doc and submit a 1-GPU run. |
@MaxThevenet It looks like something is wrong with the libpng installed on JUWELS. A workaround is to build your own libpng. For that you also need to build your own zlib.
Do not forget to add this to your PIConGPU profile
|
Hello @MaxThevenet . Regarding launching, yes, tbg is a wrapper for batch submission systems (such as slurm) so that there are unified PIConGPU workflows on all machines. The JUWELS profile you linked above should be ready to work with tbg out-of-the-box. |
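As a rough illustration of the wrapper idea described above, the following Python sketch fills a batch-script template from a set of variables. It is not tbg's implementation: the placeholder names, the partition name, the binary path, and the template text are all hypothetical and only meant to show why one workflow can target different batch systems by swapping the template.

```python
from string import Template

# Hypothetical batch-script template. It does NOT reproduce the real JUWELS
# template shipped with PIConGPU; placeholders and paths are illustrative.
JOB_TEMPLATE = Template(
    "#!/usr/bin/env bash\n"
    "#SBATCH --partition=$queue\n"
    "#SBATCH --nodes=$nodes\n"
    "#SBATCH --time=$walltime\n"
    "srun ./bin/picongpu $program_args\n"
)


def render_job_script(settings: dict) -> str:
    """Fill every placeholder; raises KeyError if a variable is missing."""
    return JOB_TEMPLATE.substitute(settings)


if __name__ == "__main__":
    # Assumed example values, not taken from the thread.
    print(
        render_job_script(
            {
                "queue": "gpus",
                "nodes": 1,
                "walltime": "01:00:00",
                "program_args": "-s 100",
            }
        )
    )
```

Using `substitute` rather than `safe_substitute` means a forgotten variable fails loudly before anything is submitted.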
Thanks a lot, I will try this. In the meantime, should I be able to run a test simulation? I tried
and the Then, I get two errors:
Then, manually submitting the job with |
Hm, the Slurm option |
@MaxThevenet It looks like the SLURM batch system got an update. |
Ah great, thanks. |
So when submitting
Note that
returns
|
@psychocoderHPC I tried your script, adding One caveat though, I had multiple errors like
etc. with all Overall, we encountered a few things:
Should I open new separate issues for each of these? |
Regarding the
output. I believe it does not concern |
Indeed. A lot of dependencies are optional, and the corresponding plugins are #ifdef-guarded. Thus, for example, the PNG plugin simply does not exist when the PNGwriter dependency was not found by CMake, and so its options do not exist either. |
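The effect described above can be mimicked in a few lines of Python. This is only an analogy under stated assumptions: PIConGPU makes the decision at compile time in C++ with preprocessor guards, whereas this sketch decides at run time, and the option name `--png.period` and the dependency label `pngwriter` are hypothetical stand-ins.

```python
import argparse


def build_parser(available_deps: set) -> argparse.ArgumentParser:
    """Register plugin options only for dependencies that were found."""
    parser = argparse.ArgumentParser(prog="simulation")
    parser.add_argument("--steps", type=int, default=0)
    # Analogue of an #ifdef guard: without the dependency, the option is
    # never registered, so the parser does not know it at all.
    if "pngwriter" in available_deps:
        parser.add_argument("--png.period", dest="png_period", type=int)
    return parser


if __name__ == "__main__":
    argv = ["--steps", "10", "--png.period", "5"]

    # Build with the optional dependency: the option is understood.
    args, unknown = build_parser({"pngwriter"}).parse_known_args(argv)
    print(args.png_period, unknown)

    # Build without it: the same option is reported as unrecognized.
    args, unknown = build_parser(set()).parse_known_args(argv)
    print(unknown)
```

This is why a .cfg file that passes options for a plugin missing from the current build leads to "unrecognized option" style errors rather than the plugin being silently skipped.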
Regarding the phase space plugin, do you have the HDF5 dependency (edit: actually the libSplash dependency, which in turn depends on HDF5)? The plugin so far relies on that, though not for long, as discussed in #3357 |
Thanks a lot for clarifying these things. Indeed, libSplash seems to have been installed successfully. My current workflow is to:
What are the action items to make these easier (essentially, make the |
Thanks for summarizing. My thoughts are below.
That's right!
In case you know a general solution to express it via environment variables on JUWELS, please share it here or via a PR (I believe we tried to do that, but apparently it turned out not to be general). In case the project number needs to be hard-coded, you could try the following way. Copy the
That would be very welcome!
I am not sure what the issue is there. I will take a look on our system first. Meanwhile, if you already have a functioning .cfg file after manual changes, you could just copy this file into your new input directory. It is also possible to use pic-create to create one input directory from another one (not necessarily from inside our repository). In this case, be aware that after you "copy" (with pic-create), the new directory may contain the PIConGPU binary built from the old one. To remove it and rebuild (if you modify .param files), remove the .build subdirectory and run pic-build again. |
Sorry, I perhaps misunderstood one point. To clarify, you can disable the phaseSpace plugin (which does not work for a yet unknown reason) before submitting a job with tbg. In order to do so, modify your So the contents of .cfg files are directly mapped to the command line parameters of PIConGPU; however, they are normally (relatively) more human-readable in a .cfg file since they are grouped by variables and have comments. |
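To make the mapping described above concrete, here is a small Python sketch of grouped settings being flattened into one command line, with one group left out to disable a plugin. It is an assumption-laden illustration, not how tbg or PIConGPU parse .cfg files: the group names, the option names `--plugin_a.period` and `--plugin_b.period`, and the binary path are hypothetical.

```python
import shlex

# Hypothetical grouped settings, standing in for the named variables of a
# .cfg file. Each group holds the command-line tokens it contributes.
CFG = {
    "grid": ["-g", "64", "64", "64"],
    "steps": ["-s", "100"],
    "plugin_a": ["--plugin_a.period", "10"],
    "plugin_b": ["--plugin_b.period", "50"],
}


def build_command(binary: str, cfg: dict, disabled: tuple = ()) -> str:
    """Flatten all enabled groups, in order, into a single command line."""
    tokens = [binary]
    for name, group in cfg.items():
        if name in disabled:
            continue  # dropping a group removes its options entirely
        tokens.extend(group)
    return shlex.join(tokens)


if __name__ == "__main__":
    print(build_command("./picongpu", CFG))
    # Disabling one plugin is just a matter of leaving its group out.
    print(build_command("./picongpu", CFG, disabled=("plugin_b",)))
```

Grouping by variable keeps related options together and commentable, while the program itself only ever sees the flat list of arguments.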
Regarding the phaseSpace plugin: I could not reproduce it on Hemera so far. @MaxThevenet to help investigate, could you provide the output of the following commands? Please run them from a directory with compiled PIConGPU (after pic-build) that causes the Btw, generally -h should output all plugins that are available for the current PIConGPU build, i.e. for the dependencies used and the species defined. |
Note: The phaseSpace plugin requires @MaxThevenet did you compile libSplash on your own? |
From that message it seemed so. That's why I would like to see the -v and -h outputs. |
@Anton-Le Could you please check whether SLURM uses |
OK so
and the |
@MaxThevenet Yes, there is such an option, by overwriting the default As long as system settings (GPUs per node, memory, etc.) are the same for the partitions |
Wonderful, thanks! |
Or alternatively you can change it here. This value is passed to Normally we make a separate pair of |
Hi everyone, since the latest Juwels update, some module versions are newer than required by PIConGPU according to https://picongpu.readthedocs.io/en/0.5.0/install/dependencies.html GCC/9.3.0 While some module versions are compatible (I did not list all) and others may be installed from source to circumvent this issue (e.g. boost), GCC and CUDA are newer than supported by PIConGPU. Do you know how to fix this or do you have by chance a working PIConGPU version for the new module versions? Thank you very much. |
@Anton-Le Could you please comment on that? You are currently the one running on JUWELS. @TheresaBruemmer with reference to your email: In an offline discussion, @Anton-Le just told me that he had no issues with the "newer" Boost version 1.74.0. |
pic-build seems to run smoothly until 85%. Then, I get:
|
@TheresaBruemmer I have never seen such a compile error. @psychocoderHPC, @sbastrakov and @Anton-Le have you ever encountered something like this (in Juelich)? As a recap: Please be aware that CUDA/11.0 and Boost/1.74.0 are used here. |
From the error log, this is an alpaka error, not a PIConGPU one. I have not encountered it before; however, gcc 9.3 is not officially supported by alpaka (nor by PIConGPU). |
@TheresaBruemmer I have quickly tested on our dev system, which seems to have the same software versions. The 0.5.0 release (same as branch |
@sbastrakov and @PrometheusPi Great, thank you. I pulled the dev again and now it works! |
@sbastrakov Thanks for testing 0.5.0 and dev |
@TheresaBruemmer please remember to close issues if no further questions exist ;) @PrometheusPi ping! |
Since @TheresaBruemmer is no longer at DESY, @MaxThevenet could you comment on the status of this issue? |
Hi all,
We can close it, thanks for your help back then!
Cheers,
Maxence
|
Hi everyone, I am working with @TheresaBruemmer to start using PIConGPU for ICS simulations (so this issue is related to #3346). Before trying to install on Maxwell, I am trying on Juwels as PIConGPU has already been installed there.
Following the instructions, from my vanilla configuration on Juwels, I did:
This last step, installing PNGwriter, failed at step make install with an error. Do you know where this comes from? Should I open an issue on the pngwriter repo instead? Note that the CMake step worked well; see CMake_output.txt for the CMake output.