JOSS review: Documentation #121
Thanks for the review!
Apologies, I had not looked under

I understand that the size of the data files may prevent you from distributing them as part of the package. I don't really have a universal solution for that; I would suggest looking at services your institute or HPC center may offer for data storage and distribution.

I'm perhaps a bit confused about the issue of data being non-public. If there is no public data, it's difficult to see the benefits of an open-source processing tool. I understand you might want to keep unpublished data private, but you should have something published that can be used to run the benchmarks on. I would suggest at least clearly documenting the benchmarks that only make sense to run on large datasets that may not be available as part of the package, and providing data for the benchmarks where you can get meaningful results within the limits of what you can store on GitHub.
Thanks for the suggestions! After asking our colleagues, we have now found some open-access Vlasiator data. We will update the benchmarks and add links to the data.
In Vlasiator.jl v0.9.35, we have updated the benchmarks. We have also fixed some issues in comparing the performance between Python/NumPy and Julia, which should now give fairer results. Previously, the array-reading benchmarks in Python included the sorting of cells in the data-reading step; that sorting has now been moved to the metadata-loading step for a better comparison with the procedure in Julia.
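The change described above can be sketched roughly as follows; the class and function names here are hypothetical, not the actual Analysator or Vlasiator.jl API:

```python
import numpy as np

# Hypothetical sketch of the change described above: compute the sorting
# permutation of the cell IDs once, at metadata-loading time, and then
# reuse it for every subsequent variable read.
class Metadata:
    def __init__(self, cellids):
        # This is the step that moved out of the per-variable read.
        self.order = np.argsort(cellids)

def read_variable(meta, raw_values):
    # The per-variable read now only applies the precomputed permutation.
    return raw_values[meta.order]

meta = Metadata(np.array([3, 1, 2]))
sorted_values = read_variable(meta, np.array([30.0, 10.0, 20.0]))
# sorted_values holds the data reordered by ascending cell ID
```

With the permutation cached on the metadata object, the Python and Julia read benchmarks time the same amount of work.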
I would still like to ask for a bit clearer documentation on the benchmarks. With some guesswork I can link some of the numbers reported in https://henry2004y.github.io/Vlasiator.jl/dev/benchmark/ to the output of `benchmarks.jl`.
I would also note that I ran into an error that seems to be reported in JuliaLang/IJulia.jl#1060. The solution in the first comment of that issue fixed it for me.
I ran into this issue as well, and indeed the first comment fixed it. However, for some reason it didn't work for the PkgBenchmark setup that I added recently. In CI, my current workaround is to manually create a symbolic link under the benchmark folder. This then works for both CI and local PkgBenchmark runs.
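A minimal sketch of such a workaround is shown below; the link name and target are hypothetical, since the actual ones depend on where the benchmark data lives in the repository:

```shell
# Hypothetical sketch: link a shared data directory into benchmark/ so the
# same relative path resolves both in CI and in a local PkgBenchmark run.
mkdir -p benchmark data
ln -sfn "$(pwd)/data" benchmark/data
# Verify the link was created.
test -L benchmark/data && echo "symlink in place"
```

The `-n` flag keeps `ln` from descending into an already-existing link when the step is re-run in CI.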
Are you running

Outputs of running `perf.jl` on my laptop:
julia> include("perf.jl")
[ Info: Benchmark file 1d_single.vlsv found...
[ Info: Benchmark file bulk.2d.vlsv found...
[ Info: Benchmark file 2d_double.vlsv found...
[ Info: Benchmark file 2d_AFC.vlsv found...
[ Info: Benchmark file 3d_EGI.vlsv found...
(1/3) benchmarking "plot"...
(1/2) benchmarking "contour_nonuniform"...
done (took 21.619722403 seconds)
(2/2) benchmarking "contour_uniform"...
done (took 22.029348411 seconds)
done (took 44.512377048 seconds)
(2/3) benchmarking "load"...
(1/5) benchmarking "bulk.2d.vlsv"...
done (took 1.100194611 seconds)
(2/5) benchmarking "1d_single.vlsv"...
done (took 0.831158252 seconds)
(3/5) benchmarking "2d_double.vlsv"...
done (took 1.328040379 seconds)
(4/5) benchmarking "3d_EGI.vlsv"...
done (took 21.270450767 seconds)
(5/5) benchmarking "2d_AFC.vlsv"...
done (took 21.220161608 seconds)
done (took 46.48213409 seconds)
(3/3) benchmarking "read"...
(1/10) benchmarking "3d_EGI.vlsv_sorted"...
done (took 20.983659774 seconds)
(2/10) benchmarking "3d_EGI.vlsv_unsorted"...
done (took 21.10458052 seconds)
(3/10) benchmarking "bulk.2d.vlsv_sorted"...
done (took 0.883095677 seconds)
(4/10) benchmarking "2d_double.vlsv_sorted"...
done (took 1.296831014 seconds)
(5/10) benchmarking "2d_double.vlsv_unsorted"...
done (took 1.426689205 seconds)
(6/10) benchmarking "2d_AFC.vlsv_unsorted"...
done (took 21.159254651 seconds)
(7/10) benchmarking "2d_AFC.vlsv_sorted"...
done (took 21.333452554 seconds)
(8/10) benchmarking "1d_single.vlsv_unsorted"...
done (took 0.80439711 seconds)
(9/10) benchmarking "bulk.2d.vlsv_unsorted"...
done (took 0.877172364 seconds)
(10/10) benchmarking "1d_single.vlsv_sorted"...
done (took 0.809524429 seconds)
done (took 91.409206892 seconds)
3-element BenchmarkTools.BenchmarkGroup:
tags: []
"plot" => 2-element BenchmarkTools.BenchmarkGroup:
tags: ["2d", "3d"]
"contour_nonuniform" => Trial(541.233 ms)
"contour_uniform" => Trial(672.832 ms)
"load" => 5-element BenchmarkTools.BenchmarkGroup:
tags: ["1d_single.vlsv", "bulk.2d.vlsv", "2d_double.vlsv", "2d_AFC.vlsv", "3d_EGI.vlsv"]
"bulk.2d.vlsv" => Trial(978.434 μs)
"1d_single.vlsv" => Trial(318.396 μs)
"2d_double.vlsv" => Trial(4.148 ms)
"3d_EGI.vlsv" => Trial(642.755 ms)
"2d_AFC.vlsv" => Trial(693.975 ms)
"read" => 10-element BenchmarkTools.BenchmarkGroup:
tags: ["1d_single.vlsv", "bulk.2d.vlsv", "2d_double.vlsv", "2d_AFC.vlsv", "3d_EGI.vlsv"]
"1d_single.vlsv_sorted" => Trial(7.405 μs)
"3d_EGI.vlsv_unsorted" => Trial(3.164 ms)
"bulk.2d.vlsv_sorted" => Trial(31.172 μs)
"2d_double.vlsv_sorted" => Trial(220.135 μs)
"2d_double.vlsv_unsorted" => Trial(86.630 μs)
"2d_AFC.vlsv_unsorted" => Trial(7.016 ms)
"2d_AFC.vlsv_sorted" => Trial(35.849 ms)
"1d_single.vlsv_unsorted" => Trial(6.866 μs)
"bulk.2d.vlsv_unsorted" => Trial(16.736 μs)
"3d_EGI.vlsv_sorted" => Trial(15.840 ms)
----------
11.229765 seconds (885.92 k allocations: 6.243 GiB, 0.32% gc time, 3.78% compilation time)
----------
70.401988 seconds (509.29 k allocations: 12.423 GiB, 7.21% gc time, 1.66% compilation time)

Outputs of running `perf.py` on my laptop:
➜ benchmark git:(master) ✗ python3 perf.py
Using LaTeX formatting
Using matplotlib version 3.5.1
Benchmark file 1d_single.vlsv found...
Benchmark file bulk.2d.vlsv found...
Benchmark file 2d_double.vlsv found...
Benchmark file 2d_AFC.vlsv found...
Benchmark file 3d_EGI.vlsv found...
1d_single.vlsv:
Loading metadata in 0.7706 ms
1d_single.vlsv:
Reading scalar DCCRG variable in 0.0583 ms
bulk.2d.vlsv:
Loading metadata in 1.0458 ms
bulk.2d.vlsv:
Reading scalar DCCRG variable in 0.1664 ms
2d_double.vlsv:
Loading metadata in 2.5013 ms
2d_double.vlsv:
Reading scalar DCCRG variable in 0.2801 ms
2d_AFC.vlsv:
Loading metadata in 298.1129 ms
2d_AFC.vlsv:
Reading scalar DCCRG variable in 33.9831 ms
3d_EGI.vlsv:
Loading metadata in 313.2423 ms
3d_EGI.vlsv:
Reading scalar DCCRG variable in 17.4951 ms
/home/hongyang/Vlasiator/analysator/pyPlots/plot_colormap.py:1306: UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.
plt.show()
Uniform 2d plotting in 6.4596 s
/home/hongyang/Vlasiator/analysator/pyPlots/plot_colormap3dslice.py:1513: UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.
plt.show()
AMR slice plotting in 4.6197 s
Reading FSGrid variable in 104.8189 s
I was running
Yeah, this is indeed kind of confusing: currently PkgBenchmark requires the file benchmarks.jl, so I cannot use it for comparing against Python with exactly the same tasks and larger datasets. But we will find a clearer (and automatic) way of doing this!
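For reference, a minimal `benchmark/benchmarks.jl` that PkgBenchmark can pick up might look like the sketch below; the group layout and the sample-file path are illustrative assumptions, not the package's actual suite:

```julia
# Hypothetical sketch of a minimal benchmark/benchmarks.jl for PkgBenchmark.
# PkgBenchmark looks for a top-level constant named SUITE in this file.
using BenchmarkTools
using Vlasiator

const SUITE = BenchmarkGroup()
SUITE["load"] = BenchmarkGroup()

# Assumed location of a small sample file shipped with the benchmarks.
file = joinpath(@__DIR__, "data", "bulk.2d.vlsv")
SUITE["load"]["bulk.2d.vlsv"] = @benchmarkable load($file)
```

Running `PkgBenchmark.benchmarkpkg("Vlasiator")` would then execute this suite, while the cross-language `perf.jl`/`perf.py` comparison could stay as a separate script outside it.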
`benchmarks.jl` using `PkgBenchmark`, but it's not entirely clear to me how the results relate to what is reported in the documentation. The files `perf.jl` and `perf.py` probably should not be included in the package, as they contain hard-coded file and path names that aren't part of the package.

Referenced: Vlasiator.jl/docs/src/manual.md, line 233 at feda666; review thread: openjournals/joss-reviews#4906
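One way to address the hard-coded paths in `perf.jl`/`perf.py` without removing the scripts is to resolve the data location from the environment. A sketch for the Python side, where the variable name `VLASIATOR_BENCH_DATA` and the helper are assumptions rather than existing code:

```python
import os

# Hypothetical sketch: look up the benchmark data directory from an
# environment variable instead of hard-coding an absolute path in perf.py.
DATA_DIR = os.environ.get("VLASIATOR_BENCH_DATA",
                          os.path.join(os.getcwd(), "data"))

def data_file(name):
    """Return the full path to a benchmark file, or None if it is missing."""
    path = os.path.join(DATA_DIR, name)
    return path if os.path.isfile(path) else None

# Benchmarks that need an absent file can then be skipped cleanly.
if data_file("bulk.2d.vlsv") is None:
    print("Benchmark file bulk.2d.vlsv not found, skipping...")
```

The same pattern would work in `perf.jl` via `get(ENV, "VLASIATOR_BENCH_DATA", ...)`, so both scripts could ship without machine-specific paths.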