Try in Native Python #76

Merged
merged 2 commits into from
Dec 14, 2018

Conversation

saulshanabrook
Contributor

This adds an implementation that uses Python interfaces and Python functions instead of MatchPy.

The array algorithms are type-dispatching functions with a default implementation that works with generic arrays.
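
To make "type-dispatching with a generic default" concrete, here is a minimal sketch using `functools.singledispatch` from the standard library. It is not the code in this PR: `GenericArray`, `ListVector`, and `reverse` are hypothetical names made up for illustration, and the real definitions live under ./uarray/native/.

```python
from functools import singledispatch


class GenericArray:
    """Hypothetical generic array: a shape plus a function from index tuple to value."""

    def __init__(self, shape, getitem):
        self.shape = shape
        self.getitem = getitem


class ListVector:
    """Hypothetical concrete 1-D array backed by a Python list."""

    def __init__(self, items):
        self.items = list(items)
        self.shape = (len(self.items),)

    def getitem(self, idx):
        return self.items[idx[0]]


@singledispatch
def reverse(a):
    """Default implementation: only needs `shape` and `getitem`, so it works
    for any array-like object. Reverses along the first axis lazily."""
    n = a.shape[0]
    return GenericArray(
        a.shape,
        lambda idx: a.getitem((n - 1 - idx[0],) + tuple(idx[1:])),
    )


@reverse.register(ListVector)
def _reverse_list_vector(a):
    """Specialised implementation chosen by dispatching on the argument type."""
    return ListVector(a.items[::-1])
```

Calling `reverse` on a `ListVector` picks the specialised version, while any other object with a `shape` and a `getitem` falls through to the generic one.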

The idea was that since we are writing in Python, let's not reinvent what it means to be a function and instead just write our functions imperatively in Python. So instead of writing things like `If(LessThan(x, Int(2)), ...)` we write `if x < 2: ...`. We restrict ourselves to a subset of Python types, Arrays and Naturals (see ./uarray/native/typing.py for their definitions), and operations on integers. Then, given those types, we build up MoA and NumPy semantics with functions on those types.
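
A rough side-by-side of the two styles, as a sketch only. The `Int`, `LessThan`, and `If` classes here are hand-rolled stand-ins so the example runs on its own; they are not MatchPy's API or the existing uarray definitions, and `clamp_symbolic` / `clamp_native` are hypothetical functions.

```python
from dataclasses import dataclass


# --- Symbolic style: every operation is a node in an expression tree. ---
@dataclass(frozen=True)
class Int:
    value: int


@dataclass(frozen=True)
class LessThan:
    left: object
    right: object


@dataclass(frozen=True)
class If:
    cond: object
    then: object
    otherwise: object


def evaluate(e):
    """Tiny interpreter for the stand-in expression tree."""
    if isinstance(e, Int):
        return e.value
    if isinstance(e, LessThan):
        return evaluate(e.left) < evaluate(e.right)
    if isinstance(e, If):
        return evaluate(e.then) if evaluate(e.cond) else evaluate(e.otherwise)
    raise TypeError(f"unknown expression: {e!r}")


def clamp_symbolic(x):
    # Builds a tree; nothing is computed until evaluate() is called.
    return If(LessThan(x, Int(2)), Int(2), x)


# --- Native style: the same logic as ordinary Python control flow. ---
def clamp_native(x):
    if x < 2:
        return 2
    return x
```

The symbolic version stays inspectable as data, which is what makes it easy to compile; the native version is shorter and reads like normal Python, but its control flow is opaque unless we parse the function.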

This approach is a lot easier to write and follow than writing symbolic expressions, because we don't have to invent a new language/world to live in; we just use the Python one.
"But how would we use this code to compile array algorithms to xxx (C, LLVM, TensorFlow)?" is the obvious question, though. The reason I stayed away from doing things this way is that you end up with opaque Python functions with control flow in them, and if you want to compile them to another backend you have to somehow interpret them. Whereas if you build everything up in MatchPy, you limit what is possible from the get-go, at the cost of the freedom of the Python language. So yes, for this approach to be performant in any way, we would have to use or build something that could take these Python functions and reinterpret them as array expressions: something a little like Numba, but very different in scope.

From a high level, to compile/optimize these types of expressions would require at least three things:

  1. Abstraction removal/function inlining: All array functions and indexing functions should be inlined, so that you have just one large piece of code without any dependence on Python classes/functions besides the array operations and number math.
  2. Constant folding/partial evaluation: Often, there are multiple code paths depending on the dimensionality of the inputs. These should be evaluated at compile time when possible, so we need to be able to evaluate whatever can be known at compile time and replace control flow whose outcome we know (e.g. if we have an if statement and know its condition, we can replace it with the branch that is taken).
  3. All of that is fine, and will help to produce some cleaner Python code, but at some point we actually need to convert that to whatever compilation target we care about. For example, any unknown control flow (I am thinking mostly here of if) needs to be compiled to the target language, like LLVM.
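
As a sketch of what step 2 could look like, here is a small `ast.NodeTransformer` that replaces an `if` statement with the branch that is taken whenever its condition can be evaluated from values known at compile time. This is only an illustration under assumptions: `total`, `FoldKnownIfs`, and `specialize` are hypothetical names, and evaluating the condition with `eval` assumes the condition has no side effects. Conditions that mention unknown names are left alone, which is the "unknown control flow" that step 3 would still have to compile.

```python
import ast

# Hypothetical array function with a code path per dimensionality.
SOURCE = '''
def total(xs, ndim):
    if ndim == 1:
        return sum(xs)
    else:
        return sum(sum(row) for row in xs)
'''


class FoldKnownIfs(ast.NodeTransformer):
    """Replace `if` statements whose condition is decidable from `known`."""

    def __init__(self, known):
        self.known = dict(known)

    def visit_If(self, node):
        self.generic_visit(node)  # fold nested ifs first
        code = compile(ast.Expression(body=node.test), "<cond>", "eval")
        try:
            # Empty builtins: only the known values are visible to the condition.
            cond = eval(code, {"__builtins__": {}}, dict(self.known))
        except NameError:
            return node  # depends on something unknown; keep the control flow
        # Returning a list splices the chosen branch in place of the `if`.
        return (node.body if cond else node.orelse) or [ast.Pass()]


def specialize(source, func_name, **known):
    """Fold known conditions, then compile the simplified function."""
    tree = FoldKnownIfs(known).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    namespace = {}
    exec(compile(tree, "<specialized>", "exec"), namespace)
    return namespace[func_name], tree
```

For example, `specialize(SOURCE, "total", ndim=2)` returns a version of `total` whose body contains only the two-dimensional branch, while `specialize(SOURCE, "total")` leaves the `if` untouched. The specialised function keeps its original signature here; dropping the now-unused argument would be a further transformation.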

Pros of this approach:

  • Separates Python implementations of array algorithms from how to compile them
  • Integrates nicely with the proposed array protocol
  • Allows us to write array algorithms in native Python

Cons of this approach:

  • Need to implement a Python bytecode or AST parser and transformer. This is more work than starting in the abstract array indexing form we care about.
    • This also ties us closer to the Python implementation, so it might be harder to upgrade Python versions.
  • It's not as easy to constrain the language we are working in. Even if we don't support all of Python, users can still try to use features we don't support and it's hard to prevent that. Yes, we can raise an error, but it's much easier to prevent such expressions in a MatchPy world, where we get to build what is allowed.
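
For the "we can raise an error" option, a minimal sketch of an allow-list check over the AST might look like the following. The set of allowed node types and the name `check_supported` are assumptions for illustration, not something this PR implements; a real subset would need to be chosen to match whatever the compiler can handle.

```python
import ast

# Hypothetical allow-list: functions, if statements, returns, comparisons,
# arithmetic, names, and constants. Anything else is rejected.
ALLOWED = (
    ast.Module, ast.FunctionDef, ast.arguments, ast.arg,
    ast.If, ast.Return,
    ast.Compare, ast.BinOp, ast.Name, ast.Constant,
    ast.expr_context,  # Load / Store markers on names
    ast.operator,      # Add, Sub, Mult, ...
    ast.cmpop,         # Lt, Eq, ...
)


def check_supported(source):
    """Raise SyntaxError on the first construct outside the allow-list."""
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ALLOWED):
            line = getattr(node, "lineno", "?")
            raise SyntaxError(
                f"unsupported construct {type(node).__name__} on line {line}"
            )
```

This catches unsupported syntax, but only after the user has written it, and it says nothing about unsupported types or library calls, which is the gap compared with a MatchPy world where disallowed expressions cannot be built in the first place.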

My current recommendation:

  • I have found this useful to implement, and I think it has some value, at least in learning what primitives we need from this type of system.
  • I think work should continue on the MatchPy approach, with one eventual outcome being that we can translate from this new Python-first approach into a symbolic approach.

@saulshanabrook
Contributor Author

I am gonna merge this for now, so we can continue playing with it in master. It doesn't change any existing code so we can effectively ignore it unless you import it specifically.

@saulshanabrook saulshanabrook merged commit 2aa2288 into master Dec 14, 2018
@saulshanabrook saulshanabrook deleted the native-python branch December 14, 2018 16:20