Destructuring assignment #2909

Merged on Oct 27, 2020 (29 commits; showing changes from 19 commits)

Commits
1abbf0c
Initial draft of `destructuring_assignments` RFC
varkor Apr 14, 2020
49e9b80
Be explicit about pattern diagnostics
varkor Apr 14, 2020
e2ea3c1
Alpha-rename
varkor Apr 14, 2020
9d5cb89
Explain desugaring for `_` and `..`
varkor Apr 14, 2020
6d4afd5
Give examples in guide-level explanation
varkor Apr 14, 2020
24404a3
Clarify order-of-execution
varkor Apr 14, 2020
7f3edec
Tweak the order-of-assignment text
varkor Apr 14, 2020
eab6125
Add actual drawbacks
varkor Apr 14, 2020
111b630
Add link to implementation PoC
varkor Apr 14, 2020
c773c50
Add note about `ref` and `&`
varkor Apr 15, 2020
d7e5e61
Add slice `..` example
varkor Apr 15, 2020
cd08452
Fix typo and make clarification
varkor Apr 17, 2020
612bbbe
Clarify that nested structures may be destructured
varkor Apr 17, 2020
27f032e
Add a nested assignment to the first example
varkor Apr 17, 2020
f0d8224
Add note about single-variant enums and `#[non_exhaustive]`
varkor Apr 17, 2020
7aa1e96
Add note about parentheses
varkor Apr 17, 2020
b599f3e
Extend list of unsupported patterns for comprehensiveness
varkor Apr 17, 2020
b79d556
Mention that destructuring assignment is an expression
varkor Apr 17, 2020
80b6f4d
Mention `@`
varkor Apr 17, 2020
50a8dba
Add reference to alternative involving new keyword
varkor Apr 17, 2020
e5cfd65
Give earlier example of non-ident lvalue
varkor Apr 18, 2020
db1a22b
Mention field shorthand
varkor Apr 19, 2020
821dcc8
Add note about place expressions
varkor Apr 19, 2020
ff995bb
Add notion of "assignee expression"
varkor May 31, 2020
7120df9
Make clarifications after review
varkor May 31, 2020
1595109
Default binding modes do not apply for destructuring assignment
varkor Sep 14, 2020
cfe880f
Fix typo
varkor Sep 17, 2020
9daec61
Fix another typo
varkor Sep 21, 2020
58091b6
Clarify behaviour of range expressions
varkor Sep 30, 2020
324 changes: 324 additions & 0 deletions text/0000-destructuring-assignment.md
@@ -0,0 +1,324 @@
- Feature Name: `destructuring_assignment`
- Start Date: 2020-04-17
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#71126](https://github.com/rust-lang/rust/issues/71126)
- Proof-of-concept: [rust-lang/rust#71156](https://github.com/rust-lang/rust/pull/71156)

# Summary
[summary]: #summary

We allow destructuring on assignment, as in `let` declarations. For instance, the following are now
accepted:

```rust
(a, (b, c)) = (0, (1, 2));
(x, y, .., z) = (1.0, 2.0, 3.0, 4.0, 5.0);
[_, f] = foo();
[g, _, h, ..] = ['a', 'w', 'e', 's', 'o', 'm', 'e', '!'];
Struct { x: a, y: b } = bar();
```

This brings assignment in line with `let` declaration, in which destructuring is permitted. This
will simplify and improve idiomatic code involving mutability.

# Motivation
[motivation]: #motivation

Destructuring assignment increases the consistency of the language, in which assignment is typically
expected to behave similarly to variable declarations. The aim is that this feature will increase
the clarity and concision of idiomatic Rust, primarily in code that makes use of mutability. This
feature is [highly desired among Rust developers](https://github.com/rust-lang/rfcs/issues/372).

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

You may destructure a value when making an assignment, just as when you declare variables. See the
[Summary](#Summary) for examples. The following structures may be destructured:

- Tuples.
- Slices.
- Structs (including unit and tuple structs).
- Unique variants of enums.

You may use `_` and `..` as in a normal declaration pattern to ignore certain values.
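
As a minimal sketch of the four cases listed above, assuming a compiler in which this feature is available (it was stabilised in Rust 1.59): the names `Point`, `Wrapper` and `guide_examples` are hypothetical and exist only for illustration.

```rust
// Hypothetical types used only for this illustration.
struct Point {
    x: i32,
    y: i32,
}

// An enum with a unique variant, so destructuring it cannot fail.
enum Wrapper {
    Only(i32, i32),
}

fn guide_examples() -> (i32, i32, i32, i32, i32, i32, i32, i32) {
    // Declared up front; each variable is initialised by an assignment below.
    let (a, b, c, d, e, f, g, h);

    (a, .., b) = (1, 0, 0, 2); // tuple; `..` skips the middle elements
    [c, _, d] = [3, 0, 4]; // array; `_` ignores one element
    Point { x: e, y: f } = Point { x: 5, y: 6 }; // struct
    Wrapper::Only(g, h) = Wrapper::Only(7, 8); // unique enum variant

    (a, b, c, d, e, f, g, h)
}

fn main() {
    assert_eq!(guide_examples(), (1, 2, 3, 4, 5, 6, 7, 8));
}
```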

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

The feature as described here has been implemented as a proof-of-concept (https://github.com/rust-lang/rust/pull/71156). It follows essentially the
[suggestions of @Kimundi](https://github.com/rust-lang/rfcs/issues/372#issuecomment-214022963) and
[of @drunwald](https://github.com/rust-lang/rfcs/issues/372#issuecomment-262519146).

The Rust compiler already parses complex expressions on the left-hand side of an assignment, but
does not handle them other than emitting an error later in compilation. We propose to add
special-casing for several classes of expressions on the left-hand side of an assignment, which act
in accordance with destructuring assignment: i.e. as if the left-hand side were actually a pattern.
Actually supporting patterns directly on the left-hand side of an assignment significantly
complicates Rust's grammar and it is not clear that it is even technically feasible. Conversely,
handling some classes of expressions is much simpler, and is indistinguishable to users, who will
receive pattern-oriented diagnostics due to the desugaring of expressions into patterns.

The general idea is that we will desugar the following complex assignments as demonstrated.

```rust
(a, b) = (3, 4);

[a, b] = [3, 4];

Struct { x: a, y: b } = Struct { x: 3, y: 4 };

// desugars to:

{
    let (_a, _b) = (3, 4);
    a = _a;
    b = _b;
}

{
    let [_a, _b] = [3, 4];
    a = _a;
    b = _b;
}

{
    let Struct { x: _a, y: _b } = Struct { x: 3, y: 4 };
    a = _a;
    b = _b;
}
```

Note that the desugaring ensures that destructuring assignment, like normal assignment, is an
expression.
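
To illustrate, here is a small sketch (assuming a compiler where the feature is available, i.e. Rust 1.59 or later) in which the value of a destructuring assignment is bound to a variable of type `()`. The function name `assign_is_expression` is hypothetical.

```rust
fn assign_is_expression() -> (i32, i32) {
    let (a, b);
    // The destructuring assignment is itself an expression of type `()`,
    // exactly like an ordinary assignment such as `a = 3`.
    let unit: () = (a, b) = (3, 4);
    let _ = unit;
    (a, b)
}

fn main() {
    assert_eq!(assign_is_expression(), (3, 4));
}
```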

We support the following classes of expressions:

- Tuples.
- Slices.
- Structs (including unit and tuple structs).
- Unique variants of enums.

In the desugaring, we convert the expression `(a, b)` into an analogous pattern `(_a, _b)` (whose
identifiers are fresh and thus do not conflict with existing variables). A nice side-effect is that
we inherit the diagnostics for normal pattern-matching, so users benefit from existing diagnostics
for destructuring declarations.

Nested structures are destructured appropriately, for instance:

```rust
let (a, b, c);
((a, b), c) = ((1, 2), 3);

// desugars to:

let (a, b, c);
{
    let ((_a, _b), _c) = ((1, 2), 3);
    a = _a;
    b = _b;
    c = _c;
};
```

We also allow arbitrary parenthesisation, as with patterns, although unnecessary parentheses will
trigger the `unused_parens` lint.

Note that `#[non_exhaustive]` must be taken into account properly: enums marked `#[non_exhaustive]`
may not have their variants destructured, and structs marked `#[non_exhaustive]` may only be
destructured using `..`.

## Diagnostics

It is worth being explicit that, in the implementation, the diagnostics that are reported are
pattern diagnostics: that is, because the desugaring occurs regardless, the messages will imply that
the left-hand side of an assignment is a true pattern (the one the expression has been converted
to). For example:

```rust
[*a] = [1, 2]; // error: pattern requires 1 element but array has 2
```

Whilst `[*a]` is not strictly speaking a pattern, it behaves similarly to one in this context. We
think that this results in a better user experience, as intuitively the left-hand side of a
destructuring assignment acts like a pattern "in spirit", but this is technically false: we should
be careful that this does not result in misleading diagnostics.

Review comment (Contributor):

I'm not familiar with the details of Rust's grammar, but would it be technically incorrect to say that we're proposing a restricted subset of patterns? Or creating a sort of "assignment pattern" grammar?

Reply (Member Author, @varkor, Apr 17, 2020):

Yes: we allow some expressions that cannot be written in patterns, like field accesses. We could describe the permitted expressions using a restricted grammar, which would look somewhat similar to that for patterns.

## Underscores and ellipses

In patterns, we may use `_` and `..` to ignore certain values, without binding them. While range
patterns already have analogues in terms of range expressions, the underscore wildcard pattern
currently has no analogous expression. We thus add one, which is only permitted in the left-hand side
of an assignment: any other use results in the same "reserved identifier" error that currently
occurs for invalid uses of `_` as an expression. A consequence is that the following becomes valid:

```rust
_ = 5;
```

Functional record update syntax (i.e. `..x`) is forbidden in destructuring assignment, as we believe
there is no sensible and clear semantics for it in this setting. This restriction could be relaxed
in the future if a use-case is found.

The desugaring treats the `_` expression as an `_` pattern and the fully empty range `..` as a `..`
pattern. No corresponding assignments are generated. For example:

```rust
let mut a;
(a, _) = (3, 4);
(.., a) = (1, 2, 3, 4);

// desugars to:

{
    let (_a, _) = (3, 4);
    a = _a;
}

{
    let (.., _a) = (1, 2, 3, 4);
    a = _a;
}
```

and similarly for slices and structs.
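
For concreteness, a sketch of the slice and struct cases, assuming a compiler where the feature is available (Rust 1.59 or later). The struct `Config` and the function `ignore_examples` are hypothetical names introduced for this example.

```rust
// Hypothetical struct used only for this illustration.
struct Config {
    width: u32,
    height: u32,
    depth: u32,
}

fn ignore_examples() -> (u32, u32, u32) {
    let (first, last, width);

    // `..` in an array skips any number of elements; nothing is assigned for them.
    [first, .., last] = [1, 2, 3, 4];

    // `..` in a struct ignores the remaining fields; `width` uses field shorthand.
    Config { width, .. } = Config { width: 7, height: 8, depth: 9 };

    // A bare `_` on the left-hand side evaluates and discards the value.
    _ = 5;

    (first, last, width)
}

fn main() {
    assert_eq!(ignore_examples(), (1, 4, 7));
}
```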

## Unsupported patterns

We do not support the following "patterns" in destructuring assignment:

- `&x = foo();`.
- `&mut x = foo();`.
- `ref x = foo();`.
- `x @ y = foo();`.
- (`box` patterns, which are deprecated.)

This is primarily for learnability: the behaviour of `&` can already be slightly confusing to
newcomers, as it has different meanings depending on whether it is used in an expression or pattern.
In destructuring assignment, the left-hand side of an assignment consists of sub*expressions*, but
which act intuitively like patterns, so it is not clear what `&` and friends should mean. We feel it
is more confusing than helpful to allow these cases. Similarly, although coming up with a sensible
meaning for `@`-bindings in destructuring assignment is not inconceivable, we believe they would be
confusing at best in this context. Conversely, destructuring tuples, slices or structs is very
natural and we do not foresee confusion with allowing these.

Our implementation is forwards-compatible with allowing these patterns in destructuring assignment,
in any case, so we lose nothing by not allowing them from the start.

Additionally, we do not give analogues for any of the following, which make little sense in this
context:

- Literal patterns.
- Range patterns.
- Or patterns.

## Compound destructuring assignment

We forbid destructuring compound assignment, i.e. destructuring for operators like `+=`, `*=` and so
on. This is both for the sake of simplicity and since there are relevant design questions that do not have obvious answers,
e.g. how this could interact with custom implementations of the operators.
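
Since a form such as `(a, b) += (1, 2);` is not accepted, one way to get the same effect is to spell out the update with a plain destructuring assignment. The following is only a sketch of that workaround, assuming Rust 1.59 or later; `bump_both` is a hypothetical name.

```rust
fn bump_both() -> (i32, i32) {
    let (mut a, mut b) = (10, 20);
    // `(a, b) += (1, 2);` is rejected, so the new values are computed
    // explicitly on the right-hand side and then assigned.
    (a, b) = (a + 1, b + 2);
    (a, b)
}

fn main() {
    assert_eq!(bump_both(), (11, 22));
}
```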

## Order-of-assignment

The right-hand side of the assignment is always evaluated first. Then, assignments are performed
left-to-right. Note that component expressions in the left-hand side may be complex, and not simply
identifiers.
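
The following sketch (assuming Rust 1.59 or later) shows both points: because the right-hand side is evaluated before any assignment happens, `(a, b) = (b, a)` swaps the two variables, and the components of the left-hand side may be places such as index expressions. The function name `swap_and_store` is hypothetical.

```rust
fn swap_and_store() -> (i32, i32, Vec<i32>) {
    let (mut a, mut b) = (1, 2);

    // The tuple `(b, a)` is fully evaluated first, then `a` and `b` are
    // assigned left-to-right, so this swaps the two values.
    (a, b) = (b, a);

    // Left-hand components need not be identifiers: index expressions work too.
    let mut v = vec![0, 0];
    (v[0], v[1]) = (a, b);

    (a, b, v)
}

fn main() {
    assert_eq!(swap_and_store(), (2, 1, vec![2, 1]));
}
```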

In a declaration, each identifier may be bound at most once. That is, the following is invalid:

```rust
let (a, a) = (1, 2);
```

For destructuring assignments, we currently permit assignments containing identical identifiers. However, these trigger an "unused assignment"
warning.

```rust
(a, a) = (1, 2); // warning: value assigned to `a` is never read
assert_eq!(a, 2);
```

We could try to explicitly forbid this. However, the chosen behaviour is justified in two ways:
- A destructuring assignment can always be written as a series of assignments, so this behaviour
  matches its expansion.
- In general, we are not able to tell when overlapping assignments are made, so such an error could
  not be reported reliably. This is illustrated by the following example:

```rust
fn foo<'a>(x: &'a mut u32) -> &'a mut u32 {
    x
}

fn main() {
    let mut x: u32 = 10;
    // We cannot tell that the same variable is being assigned to
    // in this instance.
    (*foo(&mut x), *foo(&mut x)) = (5, 6);
    assert_eq!(x, 6);
}
```

We thus feel that a lint is more appropriate.

# Drawbacks
[drawbacks]: #drawbacks

- It could be argued that this feature increases the surface area of the language and thus complexity. However, we feel that by decreasing surprise, it actually makes the language less complex for users.
- It is possible that these changes could result in some confusing diagnostics. However, we have not found any during testing, and these could in any case be ironed out before stabilisation.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

As we argue above, we believe this change increases the perceived consistency of Rust and improves
idiomatic code in the presence of mutability, and that the
implementation is simple and intuitive.

One potential alternative that has been put forth in the past is to allow arbitrary patterns on the left-hand side of an assignment,
but as discussed above and [extensively in this
thread](https://github.com/rust-lang/rfcs/issues/372), it is difficult to see how this could work in
practice (especially with complex left-hand sides that do not simply involve identifiers) and it is not clear that this would have any advantages.

# Prior art
[prior-art]: #prior-art

The most persuasive prior art is Rust itself, which already permits destructuring
declarations. Intuitively, a declaration is an assignment that also introduces a new binding.
Therefore, it seems clear that assignments should act similarly to declarations where possible.
However, it is also the case that destructuring assignments are present in many languages that permit destructuring
declarations.

- JavaScript
[supports destructuring assignment](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Destructuring_assignment).
- Python [supports destructuring assignment](https://blog.tecladocode.com/destructuring-in-python/).
- Perl
[supports destructuring assignment](https://perl6advent.wordpress.com/2017/12/05/day-5-destructure-your-arguments-with-perl-6-signatures/).
- And so on...

It is a general pattern that languages support destructuring assignment when they support
destructuring declarations.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

None.

# Future possibilities
[future-possibilities]: #future-possibilities

- The implementation already supports destructuring of every class of expressions that currently
makes sense in Rust. This feature should naturally be extended to any new class of expressions for
which it makes sense.
- It could make sense to permit
[destructuring compound assignments](#Compound-destructuring-assignment) in the future, though we
defer this question for later discussions.
- It could make sense to permit [`ref` and `&`](#Unsupported-patterns) in the future.
- It [has been suggested](https://github.com/rust-lang/rfcs/issues/372#issuecomment-365606878) that
mixed declarations and assignments could be permitted, as in the following:

```rust
let a;
(a, let b) = (1, 2);
assert_eq!((a, b), (1, 2));
```

We do not pursue this here, but note that it would be compatible with our desugaring.

Review comment (@rpjohnst, Apr 17, 2020):

The inverse also seems reasonable: add syntax for an identifier pattern to assign to an existing variable rather than create a new one. That would support all patterns "for free," might largely make destructuring assignment redundant, and might be a smaller extension to the language.

Reply (Member Author):

Could you give an example of what you're imagining? The only way I could see this working is by using a new keyword to specify a destructuring assignment, but functionally this would be exactly the same. Note that the syntax has to be at least somewhat similar to what we have here, because we don't just allow assignment to patterns: we allow assignment to fields, etc. which are not valid patterns.

Reply:

I'm imagining extending the existing pattern syntax something like this:

    // old_ident is already in scope, and is written to here:
    let (<magic keyword> old_ident, new_ident) = some_tuple;

This approach doesn't need any sort of overlap between the expression and pattern grammars, it just introduces a new kind of identifier pattern to the grammar.

Reply (Member Author):

This would cover assignment to identifiers, but not any other kind of assignable expression (i.e. lvalue), like paths, function calls returning mutable references, etc. I can add it as an alternative, though.

Reply:

I'm not sure why it wouldn't cover other assignable expressions: `<magic keyword>` can prime the parser to expect any place expression.

Reply (Member Author):

Ah, in that case, then I agree that this should work; I shall add it as an alternative. However, I don't think this would be a natural extension to the language.

  • It requires a new keyword or overloading an existing one.
  • It is something that needs to be learnt (whereas I would argue that attempting destructuring assignment with the syntax we propose is something people already try naturally).
  • It changes the meaning of `let` (which has previously been associated only with binding new variables).
  • To be consistent, we would probably want to allow `let <magic keyword> x = value;`, which introduces another way to simply write `x = value;`.
  • It is longer and no more readable than the proposed syntax.

Reply (@fanzier, Apr 17, 2020):

I find @varkor's arguments very convincing. And in any case, this is a future extension, so I don't think it makes sense to discuss such details in this RFC.

EDIT: Sorry, I didn't understand this was an alternative to the whole destructuring assignment syntax. In that case it is relevant, of course.

Reply:

I think that's a good summary of the downsides. Just want to make sure it's considered as an alternative, given the relative scopes of the two approaches.

Reply (Member):

This idea had been raised before in the internals.

Reply:

In addition to what @varkor wrote above, I would suggest that people could simply pre-declare the variables they would like to declare in a mixed assignment. So instead of

    // old_ident is already in scope, and is written to here:
    let (<magic keyword> old_ident, new_ident) = some_tuple;

one would use

    // old_ident is already in scope
    let new_ident;  // new_ident is now also in scope
    (old_ident, new_ident) = some_tuple;  // write to both