Commit 570722f

update docs to include information in joss paper
1 parent 5296077 commit 570722f

3 files changed: +78 -0 lines changed

doc/sphinx/index.rst

+7
@@ -24,6 +24,12 @@ slice shown) on a GPU, with second-order convergence:
 .. image:: ../../figs/convergence.png

+Interpolation is fast and portable. Here's a benchmark showing
+performance on CPU and GPU for several architectures and problem
+sizes:
+
+.. image:: ../../figs/spiner_interpolation_benchmark.png
+
 See below for details of how to use spiner in your project and how to
 develop for it.

@@ -42,6 +48,7 @@ automatically integrated into the build system.
    :maxdepth: 1
    :caption: Contents:

+   src/statement-of-need
    src/building
    src/getting-started
    src/databox

doc/sphinx/src/statement-of-need.rst

+71
@@ -0,0 +1,71 @@
+.. _statement-of-need:
+
+Why Develop Spiner?
+====================
+
+As Moore's law comes to an end, more and more performance comes from
+specialized hardware, such as GPUs. A key tool in the toolbox for many
+scientific codes is tabulated data. Fluid and continuum dynamics codes
+often encapsulate the equation of state as data tabulated in density
+and temperature. Radiation transport codes use emissivities and
+absorption opacities from tables such as those computed in
+[@SullivanWeak]. Continuum dynamics is required for a variety of
+applications, such as astrophysics, geophysics, climate science,
+vehicle engineering, and national security, which together consume a
+very large number of supercomputer cycles. Providing tabulated data for
+these applications thus has the potential for significant impact.
+
+These capabilities must be supported on all hardware a code may be run
+on, whether this is an NVIDIA GPU, an Intel CPU, or a next-generation
+accelerator manufactured by one of any number of hardware vendors. To
+our knowledge, there is no performance-portable interpolation library
+on which these codes can rely. We developed ``Spiner`` to meet this
+clear need.
+
+For examples of software projects that leverage ``Spiner``, see
+`singularity-EOS`_, `singularity-opac`_, and `Phoebus`_.
+
+.. _singularity-eos: https://github.com/lanl/singularity-eos
+
+.. _singularity-opac: https://github.com/lanl/singularity-opac
+
+.. _Phoebus: https://github.com/lanl/phoebus
+
34+
State of the Field
35+
^^^^^^^^^^^^^^^^^^^
36+
+Interpolation is a common problem, implemented countless times across
+software projects, and a core part of any introductory text on
+scientific computing. In graphics applications, interpolation is so
+ubiquitous that hardware primitives are provided by GPUs. These
+hardware intrinsics are, however, severely limited for scientific
+applications. For example, on NVIDIA GPUs, the values to be
+interpolated must be single-precision floating point, and the
+interpolation coefficients themselves are only 9-bit fixed point,
+which is often insufficient to capture the high precision required
+for scientific applications. As GPUs are inherently vector devices,
+hardware interpolation is also vectorized in nature. However,
+downstream applications may be easier to reason about if scalar
+operations are available. For example, equation of state lookups often
+require root finds on interpolated data, and this can be easier to
+implement as a scalar operation, even if the final operation is
+vectorized over warps. Texture interpolation also does not support
+multi-dimensional mixed indexing/interpolation operations where, say,
+three indices of a four-dimensional array are interpolated and one is
+merely indexed into.
+
+Moreover, relying on hardware intrinsics is not a portable solution. A
+software interpolation library can, if written with care, work not
+only on the current generation of accelerators, but also on
+general-purpose CPUs and the next generation of hardware.
+
+Unfortunately, a performance-portable implementation that is neither
+tuned to a specific use case nor embedded in a larger project is (to
+our knowledge) not available in the literature. A common problem in
+performance-portable computing is the management of
+performance-portable data structures.
+
+Interpolation is used far beyond continuum dynamics and radiation
+transport, and we expect Spiner will find uses in the broader space
+of applications, such as image resampling. However, the team built
+Spiner with simulations in mind.