-
Notifications
You must be signed in to change notification settings - Fork 71
API Design
The design of River Trail is based on three pillars:
- a type called
ParallelArray
that holds data values - several operations on
ParallelArray
s that implement parallel constructs, likemap
- the concept of an elemental function which is passed to the operations and typically returns a single data element
We have chosen a set of operations on ParallelArray
s that we feel is minimal and upon which other data-parallel constructs can be built. For example, sum
could be implemented using reduce
, while prefix-sum
could be implemented using scan
. We anticipate useful libraries and infra-structure being built upon our constructs. This approach enables a "do few things well" implementation strategy while ensuring the composability needed to build other constructs.
We add ParallelArray
, a new data type, to JavaScript. It is a read-only data structure holding the actual data that is created by a call to a constructor or is returned from a call to one of the ParallelArray
prototype methods. Input to the constructor is typically a JavaScript Array
, a TypedArray
, or a function that is used to generate the ParallelArray
's values. For example, new ParallelArray([1,2,3])
would create a ParallelArray
holding the values 1, 2, and 3.
We also support comprehension, so the same ParallelArray
could be created using a function that generates the values. An example is new ParallelArray(3, function() { return this+1; })
. The first argument is the length of the freshly minted array, and this
inside the elemental function is the value of the index in the result that receives the returned value. All the data values are eagerly realized, so one can't use this mechanism to create infinitely large arrays. The result would be a ParallelArray
holding the values 1, 2, and 3.
Similar in spirit to the use of kernel functions in OpenCL, our approach makes use of elemental functions which are written in JavaScript. Any JavaScript function with an appropriate signature can be used as an elemental function. It is the responsibility of the River Trail system to determine if the elemental function can be optimized to take advantage of the data-parallel hardware. If optimization is not possible, the function is simply executed sequentially.
It is the responsibility of the software development environment to assist the developer in determining whether the elemental function can be optimized. There are of course a few hints that will serve the programmer well. Assignment of either a global or free variable will defeat optimization. This is because such assignments are inherently non-deterministic, and we want deterministic execution.
Recursion is another example of a typical JavaScript construct that is likely to defeat optimization. Of course, as the system matures, it is possible that tail-recursive kernels can be converted into loops and allowed. Pretty much any ParallelArray
that does not hold a TypedArray
or that cannot be turned into a TypedArray
is unlikely to be optimized. So far we have not seen meaningful use cases where this is a significant restriction.
It is important to note that if something cannot be optimized does not mean that it won't run in all browsers, just that it will run more slowly. We could have chosen to not run the code at all and throw an exception, but we decided that the culture of JavaScript dictated that we make a best-effort attempt at running all code. Of course the development environment would provide feedback to the programmer as to why a kernel could not be optimized.
JavaScript provides the object-oriented concept of a this
variable. Within the elemental functions we are careful to define this
in appropriate ways. For example, for map
, this
is the element of the ParallelArray
used to calculate the corresponding element in the resulting array. For other elemental functions, this
would be the entire ParallelArray
.
ParallelArray
comes with the following five data-parallel methods: map
, combine
, scan
, filter
, and scatter
. Each of these methods, when combined with elemental functions, creates and returns a new ParallelArray
. ParallelArray
's sixth core data-parallel method is reduce
, which can return a scalar value. With these six methods, one can create a very robust and complete data-parallel library. ParallelArray
also provides non-data-parallel methods, such as length
, that are similar to their sister methods from the Array
class which we will not touch upon here.
Finally we provide a get
method that takes an index into a ParallelArray
and returns the value located at that index. We provide multi-dimensional arrays and related methods to query things like the number of dimensions. We do not provide a set
method or other destructive array methods such as push
, pop
, shift
or unshift
. The syntax in our prototype does not include the binary operator [ ]
, to avoid confusion when it is used to set a location in a ParallelArray
.
Finally, we expect developers to use JavaScript prototype features to extend the core River Trail methods in appropriate ways. A complete library could define a reasonable sum
method using reduce
, for instance.