feat: support conditional expressions in vectorized map #5017

onelson · 2022-07-23T00:35:33Z

Adds support for vectorized functions containing conditional expressions.
This new code path is guarded by a feature flag called vectorizedConditionals, enabled in test but off by default.

In cases where the test (i.e. the condition part of the expression) is all one value or another, we can skip a lot of the new code by returning either the consequent or alternate vector directly.

For other situations, both the consequent and alternate vectors must be evaluated in their entirety after which values are selected for return based on the test value for the given index.
N.b. this could give an inconsistent outcome, compared to the non-vectorized version, for cases where these expressions have some sort of side effect. At this time, such expressions would prevent the function from being vectorized in the first place, but we may need to revisit this detail in the future.
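
As a rough illustration of the two paths described above, here is a minimal sketch over plain Go slices. It is not the template-generated arrow code from this PR; `selectInts` and its signature are hypothetical, and all three inputs are assumed to have the same length.

```go
package main

import "fmt"

// selectInts is a hypothetical helper (not the flux API). It returns
// consequent[i] where test[i] is true and alternate[i] otherwise.
// All three slices are assumed to have the same length.
func selectInts(test []bool, consequent, alternate []int64) []int64 {
	// Scan the test vector once to see whether it is uniformly true or false.
	allTrue, allFalse := true, true
	for _, t := range test {
		if t {
			allFalse = false
		} else {
			allTrue = false
		}
	}
	// Fast path: a uniform test lets us hand back one branch untouched.
	if allTrue {
		return consequent
	}
	if allFalse {
		return alternate
	}
	// Slow path: both branches are already fully evaluated, so pick per index.
	out := make([]int64, len(test))
	for i, t := range test {
		if t {
			out[i] = consequent[i]
		} else {
			out[i] = alternate[i]
		}
	}
	return out
}

func main() {
	c := []int64{1, 2, 3}
	a := []int64{10, 20, 30}
	fmt.Println(selectInts([]bool{true, false, true}, c, a))    // mixed test: per-index selection
	fmt.Println(selectInts([]bool{false, false, false}, c, a)) // uniform test: alternate returned directly
}
```

Because both branches are computed before selection, any side effect in either branch would run for every row, which is the inconsistency the note above warns about.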

The flux acceptance tests pass with or without the flag enabled, but in the future when #4739 lands, these test cases should be updated to assert the vectorized code path is actually being hit.

Go unit tests verify the behavior in a more isolated way.

Done checklist

docs/SPEC.md updated N/A
Test cases written

compiler/vectorized_test.go

OfTheDelmer

It looks great! Just adding comments from our dive into IsValid vs IsNull.

array/conditional.gen.go.tmpl

onelson · 2022-07-28T17:10:45Z

@OfTheDelmer thanks for your notes. I'll do another pass on the null handling, hopefully with some new tests to verify that:

- when test is null it is considered false
- when either consequent or alternate are null
  - ... and they are the selected branch
  - we produce a null in the output vector

I think the blocks called out in the review may simplify further with this contract in mind.

onelson · 2022-07-28T18:41:44Z

Tests added in 2afaa99 for null inputs in each of the three vectors appear to be passing as-is, so that's something.

Working to simplify the checks now.

onelson · 2022-07-28T19:25:12Z

@OfTheDelmer I've simplified the null checks in 65ac960 -- if these changes seem right to you, can you resolve the remarks in your review?

Updates the Go side to support vectorized conditionals. Adds new template-generated code to handle the arrow array building code as well as updates to the `values` package which acts as an entry point to help delegate to the correct typed API based on the MonoType of the incoming vectors.
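
The delegation described above could look roughly like the following sketch. It does not use the real `values` or `semantic` packages; `Vector`, `IntVector`, `FloatVector`, `pick`, and `conditional` are hypothetical stand-ins, and a Go type switch plays the role of the MonoType check.

```go
package main

import "fmt"

// Vector is a hypothetical stand-in for a typed vector value.
type Vector interface{ Len() int }

type IntVector []int64
type FloatVector []float64

func (v IntVector) Len() int   { return len(v) }
func (v FloatVector) Len() int { return len(v) }

// pick is the shared per-index selection used by every typed entry point.
func pick[T any](test []bool, c, a []T) []T {
	out := make([]T, len(test))
	for i, t := range test {
		if t {
			out[i] = c[i]
		} else {
			out[i] = a[i]
		}
	}
	return out
}

// conditional inspects the element type of the incoming vectors and
// delegates to the matching typed call, rejecting mismatched branches.
func conditional(test []bool, consequent, alternate Vector) (Vector, error) {
	switch c := consequent.(type) {
	case IntVector:
		a, ok := alternate.(IntVector)
		if !ok {
			return nil, fmt.Errorf("type conflict: %T != %T", consequent, alternate)
		}
		return IntVector(pick[int64](test, []int64(c), []int64(a))), nil
	case FloatVector:
		a, ok := alternate.(FloatVector)
		if !ok {
			return nil, fmt.Errorf("type conflict: %T != %T", consequent, alternate)
		}
		return FloatVector(pick[float64](test, []float64(c), []float64(a))), nil
	default:
		return nil, fmt.Errorf("unsupported vector type %T", consequent)
	}
}

func main() {
	v, err := conditional([]bool{true, false}, IntVector{1, 2}, IntVector{10, 20})
	fmt.Println(v, err)
	_, err = conditional([]bool{true}, IntVector{1}, FloatVector{1.5})
	fmt.Println(err)
}
```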

Currently the Go unit tests that actually process inputs are failing due to a type conflict on the `test` portion of the expression, ex: `bool != v[bool]`. Not sure why I didn't bump into this earlier -- manual testing via scripts all seemed to work fine as is. Watching in the debugger showed execution hitting all the expected (vectorized) code paths. The "Logical Ops" PR added a `typ` field to the expression node to allow us to alter the monotype during vectorization, so I'll do the same and see if that fixes things.

Fixes an issue where the tests for vectorized conditionals broke with a type conflict on the `test` part of the expression (`bool` vs `v[bool]`). For whatever reason, it seems like we may have been running the row-based version of the function prior to this diff. Now, when a test case sets `vectorizable` to `true`, we replace the function expression with the vectorized version before passing it to `Compile`.

The test cases try to cover the range of types we support, while also targeting constants in the consequent/alternate positions to verify vec repeat plays well. Right now, the tests exercising vec repeat fail! Fixes pending.

Already covered were cases where `test` was const, as well as when EITHER `consequent` OR `alternate` were const, but not BOTH. For cases where all 3 are const, the receiver of the return value of the conditional expression will get the vec repeat from whichever branch was selected (this case was already handled). The panic from the type conflict between Time/Int is still unresolved.

Adds a new flux feature flag, propagating it down into the rust analyzer code so we can use it to activate or disable the new code path. Tests have been updated to enable the flag, but it is off by default.

Lightly modified version of work done in this (currently draft) PR: #5015

OfTheDelmer

This update looks great

array/conditional.gen.go.tmpl

OfTheDelmer · 2022-07-29T20:15:39Z

values/vector_values.gen.go

@@ -81,47 +87,71 @@ func newVectorFromSlice(values []Value, typ semantic.MonoType, mem memory.Alloca
 	case semantic.BasicInt:
 		b := arrow.NewIntBuilder(mem)
 		for _, v := range values {
-			b.Append(v.Int())
+			if v.IsNull() {


Initial work for vectorized conditionals neglected to cover cases where the inputs for `test`, `consequent`, and `alternate` could be invalid (i.e., referencing fields which could not be statically verified and are missing at runtime).

This diff aims to handle these cases by:
- treating a null `test` the same as "all false" by returning `alternate` directly.
- changing the constant (vec repeat) code paths to accept pointers to the various primitive types so they can be optionally null.
- short-circuiting for cases where both `consequent` and `alternate` are null (since the output will be null, regardless of the values in `test`).

The choice to adjust the constant handling code path to also handle the null case was largely motivated by wanting to avoid adding yet another factor to drive combinations of function signatures. This would have required a large expansion of functions in the `array` package, as well as higher complexity to be able to delegate to the new functions from the `values` side.

We started with needing (per type):
- vec, const
- const, vec
- vec, vec
- const, const

By going this way, we avoid adding (per type):
- vec, null
- null, vec
- const, null
- null, const

Furthermore, the logic required for nulls in either position is more or less identical to the const pattern, with the addition of a null check of the pointer.

Refs:
- <#4601>
- <#5017>
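
A minimal sketch of the pointer-based constant idea, assuming the (vec, const) combination and using nil pointers as nulls. `selectVecConst` is a hypothetical name, not one of the generated functions in the `array` package.

```go
package main

import "fmt"

// selectVecConst is a hypothetical sketch of the (vec, const) combination.
// The constant alternate is passed as a pointer: a non-nil pointer is an
// ordinary constant, while nil means "constant null". One signature thus
// covers both the (vec, const) and (vec, null) cases.
func selectVecConst(test []bool, consequent []*int64, alternate *int64) []*int64 {
	out := make([]*int64, len(test))
	for i, t := range test {
		if t {
			out[i] = consequent[i]
		} else {
			out[i] = alternate // nil here becomes a null slot in the output
		}
	}
	return out
}

func show(vs []*int64) {
	for _, v := range vs {
		if v == nil {
			fmt.Print("null ")
		} else {
			fmt.Print(*v, " ")
		}
	}
	fmt.Println()
}

func main() {
	one, two, zero := int64(1), int64(2), int64(0)
	consequent := []*int64{&one, &two}
	test := []bool{true, false}
	show(selectVecConst(test, consequent, &zero)) // alternate is the constant 0
	show(selectVecConst(test, consequent, nil))   // alternate is a constant null
}
```

The only difference between the const and null cases is the nil check on the pointer, which is why no separate null-specific signatures are needed in this sketch.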

…est, consequent or alternate (#5118)

* test: check for nulls in vectorized conditionals

  Nulls need special care for vectorized conditionals, similar to how we manage constants. If the test, consequent or alternate reference fields that don't exist then we need to act as if we have a "constant null" for the operation.

* fix: handle null test, consequent, alternate in vectorized conditional
* chore: use sentinel Null instead of New(nil)
* test: add planner assertion to new vectorized tests
* chore: make generate

onelson force-pushed the onelson/feat/vectorize-conditional-exprs branch 2 times, most recently from bd0516a to e84f58b Compare July 25, 2022 18:49

onelson changed the title ~~Onelson/feat/vectorize conditional exprs~~ feat: support conditional expressions in vectorized map Jul 25, 2022

OfTheDelmer reviewed Jul 26, 2022

compiler/vectorized_test.go (resolved)

onelson marked this pull request as ready for review July 26, 2022 19:39

onelson requested a review from a team as a code owner July 26, 2022 19:39

onelson requested review from OfTheDelmer and removed request for a team July 26, 2022 19:39

OfTheDelmer reviewed Jul 28, 2022

array/conditional.gen.go.tmpl (outdated, resolved; 3 threads)

onelson force-pushed the onelson/feat/vectorize-conditional-exprs branch from 740a622 to 2afaa99 Compare July 28, 2022 18:36

onelson force-pushed the onelson/feat/vectorize-conditional-exprs branch from 0da2222 to 65ac960 Compare July 28, 2022 19:21

onelson requested a review from OfTheDelmer July 28, 2022 19:25

Owen Nelson added 15 commits July 28, 2022 13:20

feat: allow cond expressions to vectorize per semantic graph

192ef8d

chore: make generate

accedfa

test: add acceptance tests for vectorized conditionals

5a1b935

fix: explicitly cast Time to Int when handling vec repeats

4519a0b

chore: comment cleanup

9371d71

chore: make generate

d013c6d

refactor: add vectorizedConditionals feature flag

da41dbf

chore: make generate

ebaf56b

refactor: allow nils in vectorized test case inputs

c88dc69

test: add test cases for null test, consequent, alternate

086f8fe

chore: tidy up tmpl

e353020

refactor: simplify null checks for test, consequent, alternate

5fae644

onelson force-pushed the onelson/feat/vectorize-conditional-exprs branch from 65ac960 to 5fae644 Compare July 28, 2022 20:20

OfTheDelmer approved these changes Jul 29, 2022

onelson merged commit 1715d9d into master Jul 29, 2022

onelson deleted the onelson/feat/vectorize-conditional-exprs branch July 29, 2022 20:22

onelson mentioned this pull request Aug 20, 2022

fix: "invalid != vector" panic for conditionals with bad values for test, consequent or alternate #5118

Merged

5 tasks
