Merge branch 'development' into jmsexton03/sundials-sunmemory-arena
jmsexton03 committed Mar 9, 2021
2 parents 18eb53d + 6ec0066 commit 5b6f1ec
Showing 179 changed files with 1,685 additions and 301 deletions.
32 changes: 32 additions & 0 deletions .editorconfig
@@ -0,0 +1,32 @@
# http://EditorConfig.org
#
# precedence of rules is bottom to top

# this is the top-most EditorConfig file
root = true


[*.{c,h,cpp,hpp,H,py}]
# 4 space indentation
indent_style = space
indent_size = 4

# setting it to true would result in too many whitespace changes to amrex
trim_trailing_whitespace = false

# unix-style newlines
end_of_line = lf

# newline ending in files
insert_final_newline = true


[*.md]
# two end of line whitespaces are newlines without a paragraph
trim_trailing_whitespace = false


[Makefile]
# TABs are part of its syntax
indent_style = tab
indent_size = unset
5 changes: 5 additions & 0 deletions .zenodo.json
@@ -41,6 +41,11 @@
"name": "Graves, Daniel",
"orcid": "0000-0001-9730-7217"
},
{
"affiliation": "Accelerator Technology & Applied Physics Division, Lawrence Berkeley National Laboratory",
"name": "Huebl, Axel",
"orcid": "0000-0003-1943-7141"
},
{
"affiliation": "NVIDIA Corporation",
"name": "Katz, Maximilian",
35 changes: 35 additions & 0 deletions CHANGES
@@ -1,3 +1,38 @@
# 21.03

-- Support multiple components in the ABecLaplacian solver. (#1825)

-- Add query functions to Arena. (#1823) They can be used to find out the
memory type and if the memory is accessible on device or host.

-- CMake: add -Wno-pass-failed to Clang-based compilers (#1815)

-- Option to cover multiple cuts in EB generation (#1810)

-- The_Async_Arena and Elixir::append (#1804)

-- Add function name output option to TinyProfiler. (#1803)

-- DPCPP: Use codeplay_host_task (#1797)

-- Add ParticleArray classes (#1796)

-- ParallelCopy_nowait & ParallelCopy_finish (#1765)

-- Add FabArray::tags() to return the tags. (#1794)

-- Bump minimum C++ standard from 11 to 14. (#1787)

-- Update the preferred short name of NVIDIA HPC SDK to nvhpc (#1788)

-- Fix precision issue in ParmParse::add (#1783)

-- Overset Solver with Refinement Ratio of 4 (#1778)

-- Refinement Ratio of 4 Support in Nodal Solver (#1774)

-- MSVC: Proper __cplusplus macro (#1773)

# 21.02

-- Abort when MLMG is detected as failing regardless of verbosity
6 changes: 4 additions & 2 deletions CMakeLists.txt
@@ -143,8 +143,10 @@ endif ()
#
# Install amrex -- Export
#
include(AMReXInstallHelpers)
install_amrex_targets(${_amrex_targets})
if(AMReX_INSTALL)
include(AMReXInstallHelpers)
install_amrex_targets(${_amrex_targets})
endif()


#
30 changes: 24 additions & 6 deletions CONTRIBUTING.md
@@ -10,10 +10,9 @@ Development generally follows the following ideas:

* Bug fixes, questions and contributions of new features are welcome!

* Bugs should be reported through GitHub issues
* We suggest asking questions through GitHub issues as well
* *Any contributions of new features that have the potential
to change answers should be done via pull requests.*
* Bugs should be reported through GitHub Issues.
* We suggest asking questions through GitHub Discussions.
* All contributions should be done via pull requests.
A pull request should be generated from your fork of
amrex and target the `development` branch. See below for
details on how this process works.
@@ -150,7 +149,8 @@ targeted PRs.
For example, if you find typos in the documentation, open a pull request that only fixes typos.
If you want to fix a bug, make a small pull request that only fixes a bug.
If you want to implement a large feature, write helper functionality first, test it and submit those as a first pull request.
If you want to implement a feature and are not too sure how to split it, just open an issue about your plans and ping other AMReX developers on it to chime in.
If you want to implement a feature and are not too sure how to split it,
just open a discussion about your plans and ping other AMReX developers on it to chime in.
Even before your work is ready to merge, it can be convenient to create a PR
(so you can use GitHub tools to visualize your changes). In this case, please
@@ -234,4 +234,22 @@ developer are flexible, but generally involve one of the following:
If a core developer is inactive for multiple years, we may reassess their
status as a core developer.
The current list of core developers is: Ann Almgren (LBNL), Vince Beckner, John Bell (LBNL), Johannes Blaschke (LBNL), Cy Chan (LBNL), Marcus Day (LBNL), Brian Friesen (NERSC), Kevin Gott (NERSC), Daniel Graves (LBNL), Max Katz (NVIDIA), Andrew Myers (LBNL), Tan Nguyen (LBNL), Andrew Nonaka (LBNL), Michele Rosso (LBNL), Sam Williams (LBNL), Weiqun Zhang (LBNL), Michael Zingale (Stony Brook University).
The current list of core developers is:
Ann Almgren (LBNL),
Vince Beckner,
John Bell (LBNL),
Johannes Blaschke (LBNL),
Cy Chan (LBNL),
Marcus Day (LBNL),
Brian Friesen (NERSC),
Kevin Gott (NERSC),
Daniel Graves (LBNL),
Axel Huebl (LBNL),
Max Katz (NVIDIA),
Andrew Myers (LBNL),
Tan Nguyen (LBNL),
Andrew Nonaka (LBNL),
Michele Rosso (LBNL),
Sam Williams (LBNL),
Weiqun Zhang (LBNL),
Michael Zingale (Stony Brook University).
6 changes: 3 additions & 3 deletions Docs/Doxygen/doxygen.conf
@@ -2015,15 +2015,15 @@ ENABLE_PREPROCESSING = YES
# The default value is: NO.
# This tag requires that the tag ENABLE_PREPROCESSING is set to YES.

MACRO_EXPANSION = NO
MACRO_EXPANSION = YES

# If the EXPAND_ONLY_PREDEF and MACRO_EXPANSION tags are both set to YES then
# the macro expansion is limited to the macros specified with the PREDEFINED and
# EXPAND_AS_DEFINED tags.
# The default value is: NO.
# This tag requires that the tag ENABLE_PREPROCESSING is set to YES.

EXPAND_ONLY_PREDEF = NO
EXPAND_ONLY_PREDEF = YES

# If the SEARCH_INCLUDES tag is set to YES, the include files in the
# INCLUDE_PATH will be searched if a #include is found.
@@ -2055,7 +2055,7 @@ INCLUDE_FILE_PATTERNS =
# recursively expanded use the := operator instead of the = operator.
# This tag requires that the tag ENABLE_PREPROCESSING is set to YES.

PREDEFINED =
PREDEFINED = AMREX_USE_MPI AMREX_USE_GPU AMREX_USE_CUDA AMREX_SPACEDIM=3

# If the MACRO_EXPANSION and EXPAND_ONLY_PREDEF tags are set to YES then this
# tag can be used to specify a list of macro names that should be expanded. The
4 changes: 4 additions & 0 deletions Docs/sphinx_documentation/source/BuildingAMReX.rst
@@ -493,6 +493,10 @@ The list of available options is reported in the :ref:`table <tab:cmakevar>` bel
+------------------------------+-------------------------------------------------+-------------------------+-----------------------+
| AMReX_DIFFERENT_COMPILER | Allow an app to use a different compiler | NO | YES, NO |
+------------------------------+-------------------------------------------------+-------------------------+-----------------------+
| AMReX_INSTALL | Generate Install Targets | YES | YES, NO |
+------------------------------+-------------------------------------------------+-------------------------+-----------------------+
| AMReX_PROBINIT | Enable support for probin file | Platform dependent | YES, NO |
+------------------------------+-------------------------------------------------+-------------------------+-----------------------+
.. raw:: latex

\end{center}
29 changes: 29 additions & 0 deletions Docs/sphinx_documentation/source/GPU.rst
@@ -807,6 +807,35 @@ gpu kernels use its memory. With :cpp:`Elixir`, the ownership of the
memory is transferred to :cpp:`Elixir` that is guaranteed to be
async-safe.

Async Arena
-----------

CUDA 11.2 introduced a new feature, the stream-ordered CUDA memory
allocator. This feature enables AMReX to solve the temporary memory
allocation and deallocation issue discussed above using a memory pool.
Instead of using :cpp:`Elixir`, we can write code like the following:

.. highlight:: c++

::

    for (MFIter mfi(mf); mfi.isValid(); ++mfi) {
        const Box& bx = mfi.tilebox();
        FArrayBox tmp_fab(bx, numcomps, The_Async_Arena());
        Array4<Real> const& tmp_arr = tmp_fab.array();
        FArrayBox tmp_fab_2;
        tmp_fab_2.resize(bx, numcomps, The_Async_Arena());

        // GPU kernels using the temporary
    }

This is now the recommended approach because it is usually more efficient than
:cpp:`Elixir`. Note that the code above also works with CUDA older than 11.2,
HIP and DPC++, where it is equivalent to using :cpp:`Elixir`.
By default, the release threshold for the memory pool is unlimited.
One can adjust it with the :cpp:`ParmParse` parameter,
``amrex.the_async_arena_release_threshold``.
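
The stream-ordering idea behind the async arena can be illustrated with a CPU-only toy model. The sketch below is not AMReX or CUDA code: `ToyStream`, `ToyAsyncArena` and `toy_async_demo` are hypothetical names invented for illustration, and a real stream runs asynchronously on the device rather than draining a queue at `synchronize()`. It only shows why releasing a temporary right after launching kernels is safe when the release is queued behind them.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

// Toy model of a GPU stream (hypothetical): work is queued in order
// and only executed when synchronize() is called.
struct ToyStream
{
    std::queue<std::function<void()>> work;

    void enqueue (std::function<void()> f) { work.push(std::move(f)); }

    void synchronize ()
    {
        while (!work.empty()) {
            work.front()();
            work.pop();
        }
    }
};

// Toy stream-ordered arena (hypothetical): free() does not release the
// memory immediately, it queues the release behind pending "kernels".
struct ToyAsyncArena
{
    ToyStream& stream;

    double* alloc (std::size_t n) { return new double[n]; }

    void free (double* p) { stream.enqueue([p] { delete[] p; }); }
};

// Launch a "kernel" that uses a temporary, release the temporary right
// away on the host side, then synchronize and return the kernel's result.
double toy_async_demo ()
{
    ToyStream stream;
    ToyAsyncArena arena{stream};
    double result = 0.0;
    {
        double* tmp = arena.alloc(4);
        stream.enqueue([tmp, &result] {
            for (int i = 0; i < 4; ++i) { tmp[i] = i + 1.0; }
            result = tmp[0] + tmp[1] + tmp[2] + tmp[3];
        });
        // Safe: the release is ordered after the kernel in the stream.
        arena.free(tmp);
    }
    stream.synchronize();
    return result;
}
```

In this model the kernel still sees valid memory because the release runs after it in queue order, which is the property that lets the loop above drop :cpp:`Elixir`.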

.. _sec:gpu:launch:

Kernel Launch
5 changes: 3 additions & 2 deletions Docs/sphinx_documentation/source/index.rst
@@ -19,8 +19,9 @@ We are always happy to have users contribute to the AMReX source code. To
contribute, issue a pull request against the development branch (details `here
<https://help.github.com/articles/creating-a-pull-request/>`_). Changes at any
level are welcome: documentation, bug fixes, new test problems, new solvers,
etc. To obtain help, simply post an
`issue <https://github.com/AMReX-Codes/amrex/issues>`_
etc. To obtain help, simply post a
`discussion <https://github.com/AMReX-Codes/amrex/discussions>`_
or an `issue <https://github.com/AMReX-Codes/amrex/issues>`_
on the AMReX GitHub webpage.

There are small stand-alone example codes that demonstrate how to use different parts of the AMReX functionality;
4 changes: 4 additions & 0 deletions Src/Amr/AMReX_Amr.H
@@ -329,8 +329,10 @@ protected:
//! Initialize grid hierarchy -- called by Amr::init.
void initialInit (Real strt_time, Real stop_time,
const BoxArray* lev0_grids = 0, const Vector<int>* pmap = 0);
#ifndef AMREX_NO_PROBINIT
//! Read the probin file.
void readProbinFile (int& init);
#endif
//! Check for valid input.
void checkInput ();
//! Restart from a checkpoint file.
@@ -445,7 +447,9 @@ protected:
int sub_cycle;
std::string restart_chkfile;
std::string restart_pltfile;
#ifndef AMREX_NO_PROBINIT
std::string probin_file;
#endif
LevelBld* levelbld;
bool abort_on_stream_retry_failure;
int stream_max_tries;
20 changes: 17 additions & 3 deletions Src/Amr/AMReX_Amr.cpp
@@ -77,7 +77,9 @@ namespace
//
int plot_nfiles;
int mffile_nstreams;
#ifndef AMREX_NO_PROBINIT
int probinit_natonce;
#endif
bool plot_files_output;
int checkpoint_nfiles;
int regrid_on_restart;
@@ -111,7 +113,9 @@ Amr::Initialize ()
Amr::first_smallplotfile = true;
plot_nfiles = 64;
mffile_nstreams = 1;
#ifndef AMREX_NO_PROBINIT
probinit_natonce = 512;
#endif
plot_files_output = true;
checkpoint_nfiles = 64;
regrid_on_restart = 0;
@@ -262,9 +266,11 @@ Amr::InitAmr ()
pp.query("compute_new_dt_on_regrid",compute_new_dt_on_regrid);

pp.query("mffile_nstreams", mffile_nstreams);
pp.query("probinit_natonce", probinit_natonce);

#ifndef AMREX_NO_PROBINIT
pp.query("probinit_natonce", probinit_natonce);
probinit_natonce = std::max(1, std::min(ParallelDescriptor::NProcs(), probinit_natonce));
#endif

pp.query("file_name_digits", file_name_digits);

@@ -302,12 +308,14 @@ Amr::InitAmr ()
setRecordDataInfo(i,datalogname[i]);
}

#ifndef AMREX_NO_PROBINIT
probin_file = "probin"; // Make "probin" the default

if (pp.contains("probin_file"))
{
pp.get("probin_file",probin_file);
}
#endif
//
// If set, then restart from checkpoint file.
//
@@ -1145,6 +1153,7 @@ Amr::init (Real strt_time,
BL_PROFILE_REGION_STOP("Amr::init()");
}

#ifndef AMREX_NO_PROBINIT
void
Amr::readProbinFile (int& a_init)
{
@@ -1231,6 +1240,7 @@ Amr::readProbinFile (int& a_init)
if (verbose > 0)
amrex::Print() << "Successfully run amrex_probinit\n";
}
#endif

void
Amr::initialInit (Real strt_time,
@@ -1266,11 +1276,13 @@ Amr::InitializeInit(Real strt_time,
//
// Init problem dependent data.
//
int linit = true;

#ifndef AMREX_NO_PROBINIT
if (!probin_file.empty()) {
int linit = true;
readProbinFile(linit);
}
#endif

cumtime = strt_time;
//
@@ -1375,11 +1387,13 @@ Amr::restart (const std::string& filename)
//
// Init problem dependent data.
//
int linit = false;

#ifndef AMREX_NO_PROBINIT
if (!probin_file.empty()) {
int linit = false;
readProbinFile(linit);
}
#endif

//
// Start calculation from given restart file.
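
The guard pattern this diff applies across `AMReX_Amr.H` and `AMReX_Amr.cpp` can be sketched in isolation. The block below is a minimal stand-in, not the AMReX source: `ToyAmr`, `TOY_NO_PROBINIT`, `probin_reads` and `clamp_natonce` are hypothetical names, and the process count is passed in explicitly instead of coming from `ParallelDescriptor::NProcs()`. It shows the member, the function and the call site all disappearing together when the macro is defined, and the same `std::max`/`std::min` clamp that the diff moves inside the guard.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Define TOY_NO_PROBINIT (hypothetical macro) to compile out probin support.
// #define TOY_NO_PROBINIT

// Clamp the requested number of simultaneous readers to [1, nprocs].
int clamp_natonce (int requested, int nprocs)
{
    return std::max(1, std::min(nprocs, requested));
}

struct ToyAmr
{
#ifndef TOY_NO_PROBINIT
    std::string probin_file = "probin";  // same default as in the diff
    int probin_reads = 0;                // counter added for this sketch only

    void readProbinFile (int& init)
    {
        (void)init;      // a real reader would act on this flag
        ++probin_reads;
    }
#endif

    // Returns how many times the probin file was read during initialization.
    int initializeInit ()
    {
#ifndef TOY_NO_PROBINIT
        if (!probin_file.empty()) {
            int linit = true;  // declared only in the scope that uses it
            readProbinFile(linit);
        }
        return probin_reads;
#else
        return 0;
#endif
    }
};
```

Declaring `linit` inside the guarded block, as the diff does, avoids an unused-variable warning when probin support is compiled out.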
13 changes: 13 additions & 0 deletions Src/Base/AMReX_Arena.H
@@ -25,6 +25,7 @@ inline bool is_aligned (const void* p, std::size_t alignment) noexcept
class Arena;

Arena* The_Arena ();
Arena* The_Async_Arena ();
Arena* The_Device_Arena ();
Arena* The_Managed_Arena ();
Arena* The_Pinned_Arena ();
@@ -88,6 +89,18 @@ public:
* \brief A pure virtual function for deleting the arena pointed to by pt
*/
virtual void free (void* pt) = 0;

// isDeviceAccessible and isHostAccessible can both be true.
virtual bool isDeviceAccessible () const;
virtual bool isHostAccessible () const;

// Note that isManaged, isDevice and isPinned are mutually exclusive.
// For memory allocated by cudaMalloc* etc., one of them returns true.
// Otherwise, none of them is true.
virtual bool isManaged () const;
virtual bool isDevice () const;
virtual bool isPinned () const;

/**
* \brief Given a minimum required arena size of sz bytes, this returns
* the next largest arena size that will align to align_size bytes
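
How the new query functions might be used can be sketched with a mock hierarchy. Nothing below is the AMReX implementation: `ToyArena`, `ToyManagedArena`, `ToyDeviceArena`, `num_memory_kinds` and `needs_copy_to_host` are hypothetical names, and the base-class defaults (plain host memory) are an assumption, since the diff shows only the declarations. The sketch illustrates the two properties stated in the comments above: both accessibility queries can be true at once, while at most one of `isManaged`, `isDevice` and `isPinned` is true.

```cpp
#include <cassert>

// Mock of the query interface (hypothetical, not amrex::Arena).
// The defaults model ordinary host memory; this is an assumption.
class ToyArena
{
public:
    virtual ~ToyArena () = default;

    virtual bool isDeviceAccessible () const { return false; }
    virtual bool isHostAccessible () const { return true; }

    virtual bool isManaged () const { return false; }
    virtual bool isDevice () const { return false; }
    virtual bool isPinned () const { return false; }
};

// Managed memory: reachable from both host and device.
class ToyManagedArena : public ToyArena
{
public:
    bool isDeviceAccessible () const override { return true; }
    bool isManaged () const override { return true; }
};

// Device memory: reachable from the device only.
class ToyDeviceArena : public ToyArena
{
public:
    bool isDeviceAccessible () const override { return true; }
    bool isHostAccessible () const override { return false; }
    bool isDevice () const override { return true; }
};

// Counts how many of the mutually exclusive kind flags are set (0 or 1).
int num_memory_kinds (const ToyArena& a)
{
    return int(a.isManaged()) + int(a.isDevice()) + int(a.isPinned());
}

// A caller can use the accessibility query to decide whether data
// must be copied before the host reads it.
bool needs_copy_to_host (const ToyArena& a)
{
    return !a.isHostAccessible();
}
```

With these mocks, a managed arena reports both accessibility flags as true, a device arena requires a copy before host access, and the plain base arena reports no memory kind at all.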