Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(dynamic): add dynamic.asArray builtin #5239

Merged
merged 15 commits into from
Oct 3, 2022
Merged

Conversation

onelson
Copy link
Contributor

@onelson onelson commented Sep 28, 2022

Fixes: #5142

This diff adds a new builtin to the experimental/dynamic package which converts a dynamic value into an array of dynamic elements.

In order to test this, I had to get a little creative and added an additional builtin called dynamic._isNotDistinct which uses the Value.Equal method display() function to compare 2 dynamics. This was made necessary by the lack of other cast functions (still pending implementation) and the fact dynamic is not permitted as table column data. This means we must convert dynamic values to something else before we can assert via testing.diff.

While testing, I ran into issue with the initial type-system work for dynamic (#5212); multiple dynamic values could not be used together because of a missing bit (they failed to unify). A small test was added at the type-system level, and the new rule added. This allowed tests like the "heterogeneous array" one to be written.

Some small refactors were also made to dynamic.dynamic which landed in #5242. Specifically it was updated to not rewrap values which were already dynamic. The original rewrap behavior caused me a confusion earlier in this PR, though I don't think you'd see evidence of it at this stage.

Checklist

Dear Author 👋, the following checks should be completed (or explicitly dismissed) before merging.

  • ✏️ Write a PR description, regardless of triviality, to include the value of this PR
  • 🔗 Reference related issues
  • 🏃 Test cases are included to exercise the new code
  • 🧪 If new packages are being introduced to stdlib, link to Working Group discussion notes and ensure it lands under experimental/
  • 📖 If language features are changing, ensure docs/Spec.md has been updated

Dear Reviewer(s) 👋, you are responsible (among others) for ensuring the completeness and quality of the above before approval.

@onelson onelson changed the title feat: add dynamic.asArray and dynamic.dynamic builtins feat: add dynamic.asArray builtin Sep 29, 2022
@onelson onelson changed the base branch from master to onelson/feat/dynamic-wrapper September 29, 2022 18:01
@onelson onelson force-pushed the onelson/feat/dyn-as-array branch from e2a17cd to b94e54a Compare September 29, 2022 18:01
stdlib/experimental/dynamic/dynamic.flux Outdated Show resolved Hide resolved
stdlib/experimental/dynamic/dynamic.flux Outdated Show resolved Hide resolved
Base automatically changed from onelson/feat/dynamic-wrapper to master September 29, 2022 19:08
@onelson onelson force-pushed the onelson/feat/dyn-as-array branch 4 times, most recently from e17f3ca to ecb47a1 Compare September 29, 2022 23:56
@onelson onelson changed the title feat: add dynamic.asArray builtin feat(dynamic): add dynamic.asArray builtin Sep 30, 2022
@onelson onelson marked this pull request as ready for review September 30, 2022 00:12
@onelson onelson requested review from a team as code owners September 30, 2022 00:12
@onelson onelson requested review from scbrickley and lwandzura and removed request for a team September 30, 2022 00:12
Copy link
Contributor

@sanderson sanderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌

Copy link

@wolffcm wolffcm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, but I had questions about whether or not asArray should autowrap its elements.


// This is similar to the "actual array" test except that the elements in the
// array are not wrapped in dynamic ahead of time. asArray therefore needs to
// wrap the elements prior to producing the `[dynamic]` it'll return.
Copy link

@wolffcm wolffcm Sep 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect it to be a bug for dynamic.jsonParse to produce a dynamic array whose elements weren't also wrapped. Maybe asArray should assert in this case?

On the other hand, I can also imagine other sources of dynamic data that, unlike JSON, are able to guarantee homogeneity of arrays. In that case, maybe it might make sense for asArray to leave the elements themselves unwrapped and produce a [int] or whatever?

I guess what I'm saying is that it seems like auto-wrapping in asArray could cover up bugs elsewhere in the stdlib (if we're being strict about wrapping), or it could be making things needlessly difficult (if we're not). Thoughts?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking some more about this, the signature of asArray is (<-v: dynamic) => [dynamic] so it needs to produce a [dynamic]. I think asArray should produce an error rather than wrap---but still not 100%.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been mulling this over - the problem of who's responsibility is is to wrap.

From my side I was trying to weigh whether or not the wrapping should happen in the initial dynamic() call when it receives an [int]. I opted to do the check on the toArray side to defer the extra work of wrapping each element until it would be needed.

If we say that toArray returns [A] instead of [dynamic] the extra wrapping could be skipped entirely for the homogeneous case but we sacrifice enforcement of the static types matching reality, I think.

The one thing I can say is if the caller sends in an [int] I would like for them to not need to manually wrap each element themselves in the script. That would be punitive.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we say that toArray returns [A] instead of [dynamic] the extra wrapping could be skipped entirely for the homogeneous case but we sacrifice enforcement of the static types matching reality, I think.

I'm agreed on this. In the past we've talked about injecting asserting casts into the AST or semantic graph in places where we're getting data from external sources to avoid having to do typechecks everywhere at runtime. But that's never come to fruition.

It seems like you're suggesting that we can think of dynamic(v: ...) as producing a dynamic value where any collections also have their values wrapped with dynamic, but that as an implementation detail we are not wrapping them. We are wrapping them lazily upon access as an optimization.

This is the opposite of what dynamic.jsonParse does in my branch. It wraps every field and array element. I really like the idea of there being some symmetry here. if you call dynamic(v: json.parse(...)) and dynamic.jsonParse(data: ...) I would expect them to be equivalent for JSON that doesn't have heterogeneous arrays. So I'm arguing for dynamic(v:...) recursing over its argument to do something simiar as dynamic.jsonParse.

I think it keeps things more manageable if we think of dynamic(v: ...) as producing dynamic data in the same manner as an external source like dynamic.jsonParse. The representation of a dynamic value may be less space efficient and have more indirection, but I think it will make it easier to reason about what's going on. We can decide to optimize later if we like.

I keep going back and forth on this in my mind, though. As always, I'm curious to know your thoughts.

I'm wondering about the use case of dynamic(v: ...). Do we ever expect it to be used outside of testing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like you're suggesting that we can think of dynamic(v: ...) as producing a dynamic value where any collections also have their values wrapped with dynamic, but that as an implementation detail we are not wrapping them. We are wrapping them lazily upon access as an optimization.

This is precisely what I was thinking. Clearly there are situations where we can't afford to be lazy, specifically the json.parse which needs to establish the element type for arrays it produces as it builds them.

I think I agree that symmetry is desirable and so I think that means adding a recursive structure to dynamic(v: ...). Perhaps there should be some code share between this and the json parser since I imagine the job is effectively the same in both cases.

If this sounds right, I can relocate my naive array wrapping code into dynamic(v:...) as a stop-gap and we can coalesce once both our branches have landed.

I'm wondering about the use case of dynamic(v: ...). Do we ever expect it to be used outside of testing?

The original use case I identified was in building json bodies to send out via http POST.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds right to me. We can change things later on if we need, since I imagine all this will be in experimental for a while.

Copy link
Contributor Author

@onelson onelson Oct 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It struck me that the recursive wrapping might be best done in values.NewDynamic() if the rule is "things that build dynamic values should do so entirely." That'd probably be the best coverage.

I've added a FIXME to mention this is a stop-gap refactor. I filed #5252 for the refactor which should happen once we've both merged our respective PRs.

stdlib/experimental/dynamic/dynamic.go Outdated Show resolved Hide resolved
stdlib/experimental/dynamic/dynamic.go Show resolved Hide resolved
stdlib/experimental/dynamic/dynamic.go Outdated Show resolved Hide resolved
Copy link
Contributor

@scbrickley scbrickley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from what @wolffcm has already pointed out, this looks good to me.

@onelson onelson force-pushed the onelson/feat/dyn-as-array branch 2 times, most recently from 30713d5 to 5f86549 Compare October 1, 2022 00:34
@onelson onelson requested a review from wolffcm October 3, 2022 16:27
Copy link

@wolffcm wolffcm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still have some questions about what the behavior of dynamic should be. But maybe that's a part of a follow-up? I wasn't clear on that.

// FIXME(onelson): wrap the value recursively, not just if v is an Array
// We want to produce a dynamic where every single value contained
// within it is also wrapped with a dynamic.
if v.Type().Nature() == semantic.Array {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you be doing something similar for objects?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, but I'm calling this out of scope for the asArray issue. I only added an initial impl of the dynamic() builtin to allow me to write tests in the first place 😩

My hope was to address this fully in the follow-up #5252

return nil, err
}
if elmType.Nature() == semantic.Dynamic {
arr.Retain()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if these calls to Retain are necessary. My sense is that we started using Retain because code in vectorization can be backed by an Arrow array that needs to be ref-counted. If values cannot be backed by Arrow arrays, which is the case everywhere outside of the vectorizing compiler, they do not need it.

There is a ton of code that makes use of the Values interface that does not retain or release, and it seems like there is no pressing need to change that---maybe this code here should follow suit. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's interesting to learn. My understanding was that Values in flux should be retained if they are passed along in the output, but maybe that's just a special case for vectorization. I've had less exposure to the interpreter side so maybe the rules are different here.

I'll rip these calls out. If we happen to see a panic arise in the coming weeks we'll know better.

for i := 0; i < arr.Len(); i++ {
v := arr.Get(i)
v.Retain()
elems[i] = values.NewDynamic(v)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm missing something, this won't wrap the elements of an array inside of another array. But maybe that's part of the FIXME above? Or is that part of the issue you filed? My understanding of that issue is that it's just about sharing code.

Copy link
Contributor Author

@onelson onelson Oct 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see what you're saying. I was mostly hoping we wouldn't need to both write the same recursive wrapper code then pick a winner. I was more thinking I'd patch in what I needed to make asArray work then we'd refactor to leverage work done in your branch, but move it into NewDynamic so we'd both benefit.

I can build this out for a full implementation if you want, but I wasn't planning on it.

Copy link

@wolffcm wolffcm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I'm totally fine with saving any further to the dynamic cast for later.

@onelson onelson force-pushed the onelson/feat/dyn-as-array branch from 9bc7311 to 7d72a94 Compare October 3, 2022 20:51
@onelson onelson merged commit fb29718 into master Oct 3, 2022
@onelson onelson deleted the onelson/feat/dyn-as-array branch October 3, 2022 21:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add dynamic.asArray function
4 participants