Radiation plugin: Save amplitudes additionally per rank #4456
Conversation
Thanks @franzpoeschel for building this. I agree that at some point, it should be converted to complex only.

Yes, exactly.

@franzpoeschel what is the status of this pull request? Should I do some testing?

It would be helpful, yes. It's all implemented, I just need to add some .rst documentation.
Force-pushed from 9df0213 to 7e2b8ca
Force-pushed from 7e2b8ca to c370613
@franzpoeschel I just saw you pushed a few minutes ago. Should I still review?

Yes, I only rebased.
Force-pushed from c370613 to 0456518
```
--e_radiation.distributedAmplitude arg (=0)  Additionally output distributed amplitudes per MPI rank.
```
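For orientation, a minimal sketch of how the per-rank output relates to the aggregated amplitude. Array names and shapes here are hypothetical stand-ins; only the sum-over-ranks relationship is taken from the analysis discussed later in this thread:

```python
import numpy as np

# Hypothetical stand-in for the per-rank complex amplitudes written when
# --e_radiation.distributedAmplitude 1 is set: one slab per MPI rank,
# stacked along the first axis (small shapes here for illustration).
rng = np.random.default_rng(42)
n_ranks, n_directions, n_frequencies = 4, 8, 16
per_rank = rng.normal(size=(n_ranks, n_directions, n_frequencies)) \
    + 1j * rng.normal(size=(n_ranks, n_directions, n_frequencies))

# The aggregated amplitude is the coherent sum over ranks; the spectral
# intensity is the absolute square of that sum.
total_amplitude = per_rank.sum(axis=0)
intensity = np.abs(total_amplitude) ** 2
print(intensity.shape)  # (8, 16)
```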
Force-pushed from 0456518 to 9f81d6a
@franzpoeschel You pushed some changes. But I am not sure what changed - could you elaborate?

I rebased after disentangling this PR from the one with the Juwels templates.
It seems that the CI is just failing some jobs? Otherwise, I've changed nothing.

I will trigger the CI again.
@franzpoeschel I just tested your pull request using the default example. Two things that this default example does differently than astrophysics simulations:

Can you give me the entire command line call for PIConGPU that you used?

Here is the call:

```bash
source /.../pr_4456/runs/001_bunch/tbg/handleSlurmSignals.sh
mpiexec -np 32 /.../pr_4456/runs/001_bunch/input/bin/picongpu \
    -d 2 8 2 \
    -g 128 3072 128 \
    -s 7500 \
    --periodic 1 0 1 \
    --e_energyHistogram.period 500 \
    --e_energyHistogram.filter all \
    --e_energyHistogram.binCount 1024 \
    --e_energyHistogram.minEnergy 0 \
    --e_energyHistogram.maxEnergy 500000 \
    --e_radiation.period 1 \
    --e_radiation.dump 2 \
    --e_radiation.totalRadiation \
    --e_radiation.start 2800 \
    --e_radiation.end 3000 \
    --e_radiation.distributedAmplitude 1 \
    --e_macroParticlesCount.period 100 \
    --versionOnce
```

Setting

I will test whether the
I can't currently reproduce the crash; I ran exactly your command line call on a default PIConGPU Bunch simulation.
I'm running this on the K80 partition of Hemera, and the memory of that partition is barely sufficient to run the simulation, but it works. What versions of openPMD and ADIOS2 are you using? Or are you using HDF5? Where do you run the setup, and with which software environment? Did you change any templates?

@franzpoeschel Sorry for the late reply.
The radiation plugin itself tried to create an HDF5 file. (The file was created, but it is zero bytes in size.)

I can confirm that the test case runs on the

I will check the validity of the K80 data asap.
Nope, even on V100 everything finishes cleanly for me..?
In this environment, I compiled a normal

```bash
#!/usr/bin/env bash
#SBATCH -n 32
#SBATCH -p fwkt_v100
#SBATCH -A fwkt_v100
#SBATCH --gres=gpu:4
#SBATCH --tasks-per-node=4

binary="$(realpath "$1/bin/picongpu")"
mkdir -p "$2"
cd "$2"

mpirun "$binary" \
    -d 2 8 2 \
    -g 128 3072 128 \
    -s 7500 \
    --periodic 1 0 1 \
    --e_energyHistogram.period 500 \
    --e_energyHistogram.filter all \
    --e_energyHistogram.binCount 1024 \
    --e_energyHistogram.minEnergy 0 \
    --e_energyHistogram.maxEnergy 500000 \
    --e_radiation.period 1 \
    --e_radiation.dump 2 \
    --e_radiation.totalRadiation \
    --e_radiation.start 2800 \
    --e_radiation.end 3000 \
    --e_radiation.distributedAmplitude 1 \
    --e_macroParticlesCount.period 100 \
    --versionOnce
```
@franzpoeschel that is strange. I doubt that the only difference in running it is that I was using

I quickly had a look at my K80 data. Everything looks plausible.
It all looks correct to me. You seem to use the same data for the new method as for the previous lastRad output.
Nevertheless, the spectra have a significant difference in magnitude and range (thus it is not just a factor missing somewhere). I will add details below.
Details on differences: If we take from the default Bunch example the last output at

```python
series = io.Series("../runs/003_bunch_k80/simOutput/radiationOpenPMD/e_radAmplitudes_%T_0_0_0.h5", access=io.Access_Type.read_only)
it = series.iterations[3000]
data_all_x_Im = it.meshes["Amplitude"]["x_Im"].load_chunk()
data_all_x_Re = it.meshes["Amplitude"]["x_Re"].load_chunk()
data_all_y_Im = it.meshes["Amplitude"]["y_Im"].load_chunk()
data_all_y_Re = it.meshes["Amplitude"]["y_Re"].load_chunk()
data_all_z_Im = it.meshes["Amplitude"]["z_Im"].load_chunk()
data_all_z_Re = it.meshes["Amplitude"]["z_Re"].load_chunk()
series.flush()

# convert to complex numbers
data_all_x = (data_all_x_Re * 1j * data_all_x_Im)[:, :, 0]
data_all_y = (data_all_y_Re * 1j * data_all_y_Im)[:, :, 0]
data_all_z = (data_all_z_Re * 1j * data_all_z_Im)[:, :, 0]
```

The data is still in PIConGPU units [sqrt(Js)]. To get the intensity in x-polarization, we need to compute the absolute square of the complex amplitude in x-polarization.

```python
tmp_old = np.abs((data_all_x)**2)
```

Plotting the data as:

```python
plt.pcolormesh(tmp_old, norm=LogNorm())
plt.colorbar()
```

The maximum is
If we want to just check whether the new per-MPI-rank output results in the same final result, we need to sum over all MPI ranks and sum over all times.

```python
N_t = 201  # number of openPMD radiation plugin outputs
data_overTime_x = np.zeros((N_t, 32, 128, 1024), dtype=np.complex128)
data_overTime_y = np.zeros((N_t, 32, 128, 1024), dtype=np.complex128)
data_overTime_z = np.zeros((N_t, 32, 128, 1024), dtype=np.complex128)

for i, it in enumerate(series.iterations):
    it = series.iterations[it]
    data_dist_x = it.meshes["Amplitude_distributed"]['x'].load_chunk()
    data_dist_y = it.meshes["Amplitude_distributed"]['y'].load_chunk()
    data_dist_z = it.meshes["Amplitude_distributed"]['z'].load_chunk()
    series.flush()
    data_overTime_x[i, :, :, :] = data_dist_x
    data_overTime_y[i, :, :, :] = data_dist_y
    data_overTime_z[i, :, :, :] = data_dist_z
```

We sum over time and MPI ranks and then compute the absolute square to convert to intensity (again just the x-component):

```python
tmp_new = np.abs(np.sum(np.sum(data_overTime_x[:, :, :, :], axis=0)[:, :, :], axis=0)**2)
```

Plotting this via:

and results in a maximum of

There seems to be no trivial factor missing.
If we plot the relative difference as follows

```python
plt.pcolormesh(tmp_new / tmp_old, norm=LogNorm())
plt.colorbar()
```

The peak radiation is underestimated, while the radiation between the peaks is overestimated by the MPI-distributed version.
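A side note on why the order of operations matters in such a comparison: complex amplitudes from different ranks must be summed before taking the absolute square, because the coherent sum |Σaᵢ|² is in general not equal to Σ|aᵢ|². A tiny numpy illustration (values are made up):

```python
import numpy as np

# Two made-up per-rank amplitudes with equal phase.
a = np.array([1 + 0j, 1 + 0j])

coherent = np.abs(a.sum()) ** 2      # |a_0 + a_1|^2 = |2|^2 = 4
incoherent = (np.abs(a) ** 2).sum()  # |a_0|^2 + |a_1|^2 = 1 + 1 = 2

print(coherent, incoherent)  # 4.0 2.0
```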
@franzpoeschel found a bug in my Python script - his code was/is right; just my analysis is wrong.
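For future readers: the reconstruction line in the analysis script above multiplies the real and imaginary parts (`Re * 1j * Im`), whereas the standard reconstruction is `Re + 1j * Im`. The thread does not state which bug was actually found, so treat this as a hedged sketch with made-up values:

```python
import numpy as np

# Made-up stand-ins for the real and imaginary parts read from the file.
data_all_x_Re = np.array([[1.0, 2.0], [3.0, 4.0]])
data_all_x_Im = np.array([[0.5, 0.0], [-1.0, 2.0]])

# Standard reconstruction: real part plus 1j times the imaginary part.
data_all_x = data_all_x_Re + 1j * data_all_x_Im

# By contrast, Re * 1j * Im yields a purely imaginary product and loses
# the real part entirely.
wrong = data_all_x_Re * 1j * data_all_x_Im

print(data_all_x[0, 0])  # (1+0.5j)
print(wrong[0, 0])       # 0.5j
```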
Great! Thanks for checking this :)

Regarding your open bullet points @franzpoeschel:

I only forgot checking the bullet; the documentation is there, I'd say.

@steindev just ran into the same crash on hemera V100 as I did.
I shall try again with tbg.

@franzpoeschel with your script it works for me - with

Yep, I also see the crash with tbg ...??
It's this line in the template script triggering the crash:

```bash
# The OMPIO backend in OpenMPI up to 3.1.3 and 4.0.0 is broken, use the
# fallback ROMIO backend instead.
# see bug https://github.com/open-mpi/ompi/issues/6285
export OMPI_MCA_io=^ompio
```
This seems to be the same bug that we already saw with chunking enabled in the normal openPMD plugin. ROMIO can be used, but

The complete error backtrace is:
And it seems to be this issue: open-mpi/ompi#7795 |
Thanks @franzpoeschel for investigating this. Since this is not an issue with your code, and since it is avoidable with proper settings, I will merge your pull request now.
As a note: this was very likely triggered by making this plugin parallel, as it was formerly serial.
Unlike the aggregated amplitudes, this uses std::complex to store the output, making things a bit inconsistent. The goal should be to use std::complex for the aggregated output too, but Richard says that some postprocessing still relies on the current format. (At the same time, we should remove the useless third dimension there.)
Output now looks like (with two ranks):
TODO