spec: convergence with Go #20

alandonovan · 2018-12-09T14:55:23Z

The Go implementation has a list of remaining differences from the Java implementation: https://github.com/google/starlark-go/blob/master/doc/spec.md#dialect-differences
I'd like us to finish the wording of a spec that we can all be happy with, even if that spec allows for some differences among implementations. I'll go through the list of differences point by point:

multiprecision integers: the spec should require that integer precision be sufficient to represent uint64 and int64 values without loss, as these are required for correct handling of protocol buffers, among other things. Obviously Bazel has no need for larger integers so it would be fine not to implement them for now, but that should be described as a limitation of the implementation.
floating point: for the same reason, lossless handling and arithmetic on float64 values must also be supported. (On this and the above point I think we were all agreed based on a meeting in NYC about 18 months ago.) Bazel has no need of floating-point at all, so again, we can state that this is a limitation of the Java implementation.
bitwise operators should be supported. They are fundamental operations on integers in every machine and programming language. Bazel may not need them, but many other uses do (anything that uses protocol buffers, for example.)
strings: we cannot realistically require a particular string encoding (UTF-8 or UTF-16) without imposing intolerable costs on implementations whose host language uses the opposite encoding. I propose we specify strings in terms of code units without specifying the encoding; UTF-8 and UTF-16 are only quantitatively different in that sense. However this does leave the Java implementation without a data type capable of representing binary data.
strings should have methods elem_ords, codepoint_ords, and codepoints. I think there was agreement on this point but the Java implementation was lagging.
A language needs some way to encode a Unicode code point as a string (and vice versa). One way to do this is the Go impl's chr and ord built-in functions. (Related: the "%c" formatting operator, which is like "%s" % chr(x).)
The Go impl permits 'x += y' rebindings at top level. I think it should probably match the Bazel implementation (which rejects them), but the whole no-global-reassign feature should be specified as a dialect option, since no client other than Bazel wants it.
The Go implementation treats assert as a valid identifier. Indeed, it uses it widely throughout its own tests. The cost of specifying this would be that tools (such as Bazel tests) that use the Python parser will not be able to parse Starlark files that use 'assert' as an identifier. Given that using Python in this way is a hack, and that files containing assert will be vanishingly rare in the Bazel test suite, that doesn't seem like a problem.
The Go impl's parser accepts unary + expressions for parity with every other ALGOL-like language. A + operator forces a check that its operand is numeric, and occasionally makes code more readable. I think the spec should include it.
In the Go impl, a method call x.f() may be separated into two steps: y = x.f; y(). I think work is underway to support this in the Java impl too. I recall we were at least agreed it was the right thing.
In the Go impl, dot expressions may appear on the left side of an assignment: x.f = 1. This is a parser issue---in Bazel, there are no mutable struct-like data types for which this operation would succeed, but other applications may need it (esp. if they use protocol buffers), so the grammar should support it nonetheless.
In the Go impl, the hash function accepts operands besides strings, as in Python. It should be an easy fix to the Java implementation to do so too.
The Go impl's sorted function accepts the additional parameters key and reverse. These make it easier to define an alternative order without the effort and unnecessary allocation of the decorate/sort/undecorate trick and a separate call to reverse.
The Go impl's type(x) returns "builtin_function_or_method" for built-in functions. This is the string Python uses. I don't have a strong feeling about the particular string, but the crucial thing is that builtin- and Starlark-defined functions must have distinct types because they support different operations. For example, in Bazel, the rule.outputs mechanism requires that its operand be a Starlark function so that its parameter names can be retrieved; this is impossible with a built-in function.
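To make the code-unit point about strings concrete, here is a minimal Python sketch (not part of either implementation; the helper name `code_unit_counts` is made up for illustration). It measures the same text in code points, UTF-8 code units (what a Go-hosted implementation would count) and UTF-16 code units (what a Java-hosted one would count).

```python
def code_unit_counts(s: str) -> dict:
    """Return the length of s in code points, UTF-8 units, and UTF-16 units."""
    return {
        # Python's own len() counts Unicode code points.
        "code_points": len(s),
        # One UTF-8 code unit is one byte.
        "utf8_units": len(s.encode("utf-8")),
        # One UTF-16 code unit is two bytes; "-le" avoids emitting a BOM.
        "utf16_units": len(s.encode("utf-16-le")) // 2,
    }


if __name__ == "__main__":
    for text in ["héllo", "😀"]:
        print(repr(text), code_unit_counts(text))
```

Under a code-unit specification, a length or index operation on the same string may therefore give different answers in different implementations, which is the difference the proposal accepts.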

kastiglione · 2018-12-09T17:27:12Z

> Given that using Python in this way is a hack

Is there more context on this? I've found it exceedingly convenient to use Python's ast module to query over and do transformations of Bazel files, for small development tasks.

alandonovan · 2018-12-09T19:39:42Z

> Is there more context on this? I've found it exceedingly convenient to use Python's ast module to query over and do transformations of Bazel files, for small development tasks.

The key word in this sentence is "convenient". :)

If you want to transform a Starlark program from Python, the right thing to do is write a Starlark parser in Python. It should be easy because you can just fork the Python parser and delete the parts you don't need.

The syntax of Starlark is, for now, a subset of Python, but longer term it could be improved by breaking compatibility. The most glaring problem is the syntax for load, which must use strings where identifiers are wanted. Tools that assume a Python parser is sufficient are taking an expedient short-cut at the expense of long term maintainability, which is the definition of a hack. The Bazel test tools I was alluding to go one step further and actually execute the Skylark program in a Python interpreter, which is very fragile indeed.
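As an illustration of the short-cut being discussed, the sketch below uses Python's `ast` module on a `load` statement. The label `//pkg:defs.bzl`, the symbol names, and the helper `loaded_symbols` are all hypothetical. It shows that the Python parser sees `load` as an ordinary call whose arguments are string literals, so a tool has to special-case it to recover the names being bound, and it would stop working if Starlark's syntax ever diverged from Python's.

```python
import ast

# Hypothetical Starlark source; the label and symbol names are invented.
SOURCE = 'load("//pkg:defs.bzl", "my_rule", alias = "other_rule")\n'


def loaded_symbols(source: str) -> list:
    """Return (module, {local_name: exported_name}) for each top-level load().

    Assumes every argument to load() is a string literal.
    """
    results = []
    for stmt in ast.parse(source).body:
        if not isinstance(stmt, ast.Expr):
            continue
        call = stmt.value
        is_load = (
            isinstance(call, ast.Call)
            and isinstance(call.func, ast.Name)
            and call.func.id == "load"
        )
        if not is_load:
            continue
        module = call.args[0].value
        # Positional strings bind a local name equal to the exported name.
        symbols = {arg.value: arg.value for arg in call.args[1:]}
        # Keyword arguments bind a local alias to an exported name.
        symbols.update({kw.arg: kw.value.value for kw in call.keywords})
        results.append((module, symbols))
    return results


if __name__ == "__main__":
    print(loaded_symbols(SOURCE))
```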

adonovan · 2019-01-23T02:45:20Z

I met with Laurent and Damien today and we agreed on the following spec changes:

Floating point literals, values, arithmetic, and the float built-in should be an optional feature behind a dialect flag. Implementations that support it should use float64 semantics. Laurent was concerned that float-to-string conversion is hard to specify and may vary by implementation language; I agree but don't see that as a particular problem.
Integer bitwise operations (int&int, int|int, int^int, ~int, int<<int, and int>>int) should be supported in all implementations, without a dialect flag.
The unary +int operation should be defined in all implementations.
String operations should be specified in terms of code units (UTF-8 bytes in Go, UTF-16 chars in Java).
The Go implementation should reject x+=y at top-level if it would reject x=x+y, as the Java implementation does.
It should be possible to call methods in two steps y = x.f; y(). The Java impl doesn't yet support it; that's a bug.
The parser should accept x.f = y. Currently the Java parser rejects it. (There are no datatypes in Bazel for which this statement can execute without error, but that is not a reason for the parser to reject it.)
hash(x) should be defined only for strings, with the same algorithm across implementations, to ensure predictable execution across tools that, say, process Bazel BUILD files. (Although many types of values are hashable, the dict data type doesn't expose the hash values or their ordering.) We need to agree on a cheap simple hash function. AD proposes FNV32. https://golang.org/src/hash/fnv/fnv.go?s=1100:1124#L32
Glenn (the F in FNV) was present and concurred. :)
The sorted function should support the key and reverse parameters as these increase flexibility and efficiency. Although lambdas are syntactically convenient for key, the key function does not typically close over variables.
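For reference, a 32-bit FNV hash is only a few lines. The sketch below is illustrative, not the agreed algorithm: the comment above does not say whether FNV-1 or FNV-1a is meant, nor which byte encoding of the string is hashed, so both variants are shown and hashing the UTF-8 bytes is an assumption. The function name `fnv32` is made up here.

```python
FNV32_OFFSET_BASIS = 0x811C9DC5
FNV32_PRIME = 0x01000193
MASK32 = 0xFFFFFFFF


def fnv32(data: bytes, variant_1a: bool = True) -> int:
    """Compute a 32-bit FNV hash of data.

    variant_1a=True  -> FNV-1a (xor the byte, then multiply)
    variant_1a=False -> FNV-1  (multiply, then xor the byte)
    """
    h = FNV32_OFFSET_BASIS
    for byte in data:
        if variant_1a:
            h ^= byte
            h = (h * FNV32_PRIME) & MASK32
        else:
            h = (h * FNV32_PRIME) & MASK32
            h ^= byte
    return h


if __name__ == "__main__":
    # Assumption: the string is hashed as its UTF-8 bytes.
    key = "hello".encode("utf-8")
    print(hex(fnv32(key)), hex(fnv32(key, variant_1a=False)))
```

Because the input is a byte sequence, implementations with different native string encodings would need to agree on the bytes fed to the hash for results to match across tools.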

...unless -globalreassign is set. Formerly, the Go implementation permitted x+=y on the basis that we couldn't statically tell whether x was a list (in which case x+=y means x.extend(y), which is not a rebinding), or some other type, in which case it is a forbidden rebinding. This change makes it match the Java implementation. See bazelbuild/starlark#20 (comment) Most non-Bazel-like clients of Starlark set the -globalreassign flag. Change-Id: I1a21fc6871e51c201529da91dc0a46ad3e99d448
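The distinction this commit message draws can be seen in Python, whose list semantics it describes Starlark as sharing. This is a sketch of the language behavior only, not of the Go resolver; the function name is invented for illustration.

```python
def augmented_assign_effects() -> tuple:
    """Show that += mutates a list in place but rebinds an int variable."""
    xs = [1, 2]
    alias = xs
    xs += [3]  # for a list this behaves like xs.extend([3])
    # The alias sees the new element, so no new list object was created.
    list_mutated_in_place = alias == [1, 2, 3] and alias is xs

    n = 1
    saved = n
    n += 1  # for an int this binds n to a new value
    # The previously saved value is unchanged: n was rebound, not mutated.
    int_rebound = saved == 1 and n == 2

    return list_mutated_in_place, int_rebound


if __name__ == "__main__":
    print(augmented_assign_effects())
```

Since the type of `x` is not known statically, a resolver that forbids global rebinding has to either reject every top-level `x += y` or allow all of them; the change described here takes the first option unless the flag is set.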

…sing getattr Related: bazelbuild/starlark#20 (comment), #5224 It's now possible to call methods in two steps `y = x.f; y()` Also, `getattr` can now be used to retrieve built-in methods. Closes #8931. PiperOrigin-RevId: 259711316

alandonovan · 2020-11-11T20:18:38Z

Update: These are all done, except:

~~the spec does not require multiprecision integers, though both the Go and Java impls use them. See Arithmetic seems to be under-specified #120 (comment)~~
string encodings: see spec: new 'bytes' data type #112
x += y rebindings at top level
assert as a valid identifier.

adonovan mentioned this issue Jan 23, 2019

resolve: disallow augmented assignments at toplevel google/starlark-go#122

Closed

alandonovan pushed a commit to google/starlark-go that referenced this issue Jan 23, 2019

resolve: enable bitwise (& | ~ ^ << >>) operators always

26b8cc9

See bazelbuild/starlark#20 (comment) Change-Id: I595f87b9b777e79065684ee2c29acb6435b232cf

adonovan mentioned this issue Jan 23, 2019

resolve: enable bitwise (& | ~ ^ << >>) operators always google/starlark-go#123

Merged

alandonovan mentioned this issue Jan 23, 2019

resolve: disallow augmented assignments at toplevel google/starlark-go#125

Merged

alandonovan pushed a commit to google/starlark-go that referenced this issue Jan 30, 2019

resolve: enable bitwise (& | ~ ^ << >>) operators always (#123)

2f3bb7c

See bazelbuild/starlark#20 (comment)

This was referenced Jul 12, 2019

Add key and reverse parameters to builtin sorted function bazelbuild/bazel#8881

Closed

Add unary plus operator, +int bazelbuild/bazel#8890

Closed

Support bitwise operations bazelbuild/bazel#8903

Closed

bazel-io pushed a commit to bazelbuild/bazel that referenced this issue Jul 16, 2019

Add unary plus operator, +int

3c00063

Implements: bazelbuild/starlark#20 (comment) Closes #8890. PiperOrigin-RevId: 258388687

bazel-io pushed a commit to bazelbuild/bazel that referenced this issue Jul 16, 2019

Add key and reverse parameters to builtin sorted function

cc2c4ee

Implement:bazelbuild/starlark#20 (comment) Closes #8881. PiperOrigin-RevId: 258401453

Quarz0 mentioned this issue Jul 18, 2019

Support two-step method calls and allow retrieving built-in methods using getattr bazelbuild/bazel#8931

Closed

bazel-io pushed a commit to bazelbuild/bazel that referenced this issue Jul 24, 2019

Support bitwise operations

66aa424

Add support for `~`, `&`, `|`, `^`, `<<`, `>>` bitwise operations. Implements: bazelbuild/starlark#20 (comment) Closes #8903. PiperOrigin-RevId: 259732302

adonovan added a commit to google/starlark-go that referenced this issue Nov 11, 2020

spec: remove stale implementation notes

20c845a

Updates bazelbuild/starlark#20 Change-Id: I03799ea17c1fb3b5658d0b285c28813900783c35

alandonovan pushed a commit to google/starlark-go that referenced this issue Nov 11, 2020

spec: remove stale implementation notes (#315)

3b0f582

Updates bazelbuild/starlark#20

adonovan added a commit to google/starlark-go that referenced this issue Dec 4, 2020

resolver: make -nesteddef and -lambda always on

74cff92

See bazelbuild/starlark#145 for spec changes. Updates bazelbuild/starlark#20 Change-Id: I31e6258cc6caef6bcd3eab57ccec04f1b858b7e7

alandonovan mentioned this issue Dec 4, 2020

resolver: make -nesteddef and -lambda always on google/starlark-go#328

Merged

adonovan added a commit to google/starlark-go that referenced this issue Dec 4, 2020

resolver: make -nesteddef and -lambda always on

152a019

See bazelbuild/starlark#145 for spec changes. Updates bazelbuild/starlark#20 Change-Id: I31e6258cc6caef6bcd3eab57ccec04f1b858b7e7

alandonovan pushed a commit to google/starlark-go that referenced this issue Jan 21, 2021

resolver: make -nesteddef and -lambda always on (#328)

cea917a

See bazelbuild/starlark#145 for spec changes. Updates bazelbuild/starlark#20

dieortin mentioned this issue Sep 11, 2023

proposal: add a set data type #264

Closed
