Types for enum variants #1450
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,243 @@ | ||
- Feature Name: variant_types | ||
- Start Date: 2016-01-07 | ||
- RFC PR: (leave this empty) | ||
- Rust Issue: (leave this empty) | ||
|
||
# Summary | ||
|
||
Make enum variants first-class types. Variant types can be used just like any | ||
other type. When a new instance of an enum is created, or we use `@` syntax in a | ||
match expression to create a variable known to be a particular variant, we choose | ||
between the enum and variant type in a similar way to the treatment of integers. | ||
We default to the enum type to preserve backwards compatibility. | ||
|
||
This RFC previously included a proposal for untagged enums as a kind of union | ||
data type. That has been removed. | ||
|
||
|
||
# Motivation | ||
|
||
Enums are a convenient way of dealing with data which can be in one of many | ||
forms. When dealing with such data, it is typical to match, then perform some | ||
operations on the interior data. However, in many cases there is a large amount | ||
of processing to be done. Ideally we would factor that out into a function, | ||
passing the data to the function. However, currently in Rust, enum variants are | ||
not types and so we must choose an unsatisfactory workaround: we pass | ||
each field of the variant separately (leading to unwieldy function signatures | ||
and poor maintainability), we pass the whole variant with enum type (and have to | ||
match again, with `unreachable!` arms in the function), or we embed a struct | ||
within the variant and pass the struct (duplicating data structures for no good | ||
reason). It would be much nicer if we could refer to the variant directly in the | ||
type system. | ||
Would be cool to have some examples here to make the motivations clearer. |
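
The three workarounds listed in the motivation can be sketched in current Rust as follows. This is only an illustration: the helper names (`process_fields`, `process_enum`, `process_struct`) and the types `Variant2Data` and `FooWithStruct` are hypothetical and do not appear in the RFC.

```rust
pub enum Foo {
    Variant1,
    Variant2(i32, &'static str),
    Variant3 { f1: i32, f2: &'static str },
}

// Workaround 1: pass each field separately (the signature grows with the variant).
pub fn process_fields(n: i32, s: &'static str) -> usize {
    n as usize + s.len()
}

// Workaround 2: pass the whole enum and match again inside the function.
pub fn process_enum(f: &Foo) -> usize {
    match *f {
        Foo::Variant2(n, s) => n as usize + s.len(),
        _ => unreachable!("caller promised a Variant2"),
    }
}

// Workaround 3: embed a struct in the variant and pass the struct.
// A second enum is used here so that `Foo` above stays unchanged.
pub struct Variant2Data {
    pub n: i32,
    pub s: &'static str,
}

pub enum FooWithStruct {
    Variant1,
    Variant2(Variant2Data),
}

pub fn process_struct(d: &Variant2Data) -> usize {
    d.n as usize + d.s.len()
}
```

With variant types, a single `fn process(v: Foo::Variant2)` would presumably replace all three.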
||
|
||
|
||
# Detailed design | ||
|
||
Consider the example enum `Foo`: | ||
|
||
```rust | ||
pub enum Foo { | ||
Variant1, | ||
Variant2(i32, &'static str), | ||
Variant3 { f1: i32, f2: &'static str }, | ||
} | ||
``` | ||
|
||
We create new instances by constructing one of the variants. The only type | ||
introduced is `Foo`. Variant names can only be used in patterns and for creating | ||
instances. E.g., | ||
|
||
```rust | ||
fn new_foo() -> Foo { | ||
Foo::Variant2(42, "Hello!") | ||
} | ||
``` | ||
|
||
This RFC proposes allowing the programmer to use variant names as types, e.g., | ||
|
||
```rust | ||
fn bar(x: Foo::Variant2) {} | ||
struct Baz { | ||
field: Foo::Variant3, | ||
} | ||
``` | ||
|
||
|
||
## Constructors | ||
|
||
Consider `let x = Foo::Variant1;`. Currently `x` has type `Foo`. In order to | ||
preserve backwards compatibility, this must remain the case. However, it would | ||
be convenient for `let x: Foo::Variant1 = Foo::Variant1;` to also be valid. | ||
|
||
The type checker must consider multiple types for an enum construction | ||
expression - both the variant type and the enum type. If there is no further | ||
information to infer one or the other type, then the type checker uses the enum | ||
type by default. This is analogous to the system we use for integer fallback or | ||
default type parameters. | ||
This is great, pretty much what I expected and @nikomatsakis also confirmed it's what he would do (although him and @aturon aren't sure it's enough i.e. compared to full subtyping).
Note that it gets a bit more complicated if we also support nested enums, though probably the fallback would just be to the root in that case.
Yes, I wanted to add that with nested enums we would have a chain of potential types to infer to, but I realized that nested enums aren't in the scope of this RFC. |
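
For comparison, the integer fallback that the constructor rule is modelled on looks like this in current Rust. The function name `fallback_sizes` is hypothetical; the sizes are used only to make the inferred type observable.

```rust
use std::mem::size_of_val;

pub fn fallback_sizes() -> (usize, usize) {
    // No other constraint on the literal: inference falls back to `i32`.
    let a = 1;
    // The annotation supplies the missing information, so no fallback happens.
    let b: u8 = 1;
    (size_of_val(&a), size_of_val(&b))
}
```

Under the proposal, `let x = Foo::Variant1;` would play the role of `a` (falling back to `Foo`), and `let x: Foo::Variant1 = Foo::Variant1;` the role of `b`.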
||
|
||
The type of the variants when used as functions must change. Currently they have | ||
a type which maps from the field types to the enum type: | ||
|
||
```rust | ||
let x: &Fn(i32, &'static str) -> Foo = &Foo::Variant2; | ||
``` | ||
|
||
I.e., one could imagine an implicit function definition: | ||
|
||
```rust | ||
impl Foo { | ||
fn Variant2(a: i32, b: &'static str) -> Foo { ... } | ||
} | ||
``` | ||
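
The current behaviour can be checked on stable Rust: a tuple variant name coerces to an ordinary function pointer that returns the enum type. A small sketch, where `build_all` is a hypothetical helper:

```rust
#[derive(Debug, PartialEq)]
pub enum Foo {
    Variant1,
    Variant2(i32, &'static str),
    Variant3 { f1: i32, f2: &'static str },
}

pub fn build_all(pairs: &[(i32, &'static str)]) -> Vec<Foo> {
    // The variant name is used as a function from its fields to `Foo`.
    let ctor: fn(i32, &'static str) -> Foo = Foo::Variant2;
    pairs.iter().map(|&(n, s)| ctor(n, s)).collect()
}
```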
|
||
This would change to accommodate inferring either the enum or variant type. | ||
Imagine: | ||
|
||
```rust | ||
impl Foo { | ||
fn Variant2<T=Foo>(a: i32, b: &'static str) -> T { ... } | ||
} | ||
``` | ||
|
||
Since we do not allow generic function types, the result type must be chosen | ||
when the function is referenced: | ||
|
||
```rust | ||
let x: &Fn(i32, &'static str) -> Foo = &Foo::Variant2::<Foo>; | ||
let x: &Fn(i32, &'static str) -> Foo::Variant2 = &Foo::Variant2::<Foo::Variant2>; | ||
``` | ||
|
||
Due to the default type parameter, we remain backwards compatible: | ||
|
||
```rust | ||
let x: &Fn(i32, &'static str) -> Foo = &Foo::Variant2; | ||
``` | ||
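
Choosing the result type at the point where the function is referenced can be approximated in current stable Rust with an ordinary generic function. The following is a sketch under assumptions: the trait `FromVariant2`, the struct `Variant2Data` (standing in for the variant type) and the function `variant2` are all hypothetical. Unlike the `T=Foo` default imagined above, there is no default here, so the type parameter is always written out.

```rust
#[derive(Debug, PartialEq)]
pub struct Variant2Data(pub i32, pub &'static str);

#[derive(Debug, PartialEq)]
pub enum Foo {
    Variant1,
    Variant2(Variant2Data),
}

// Hypothetical trait: anything that can be built from Variant2's fields.
pub trait FromVariant2 {
    fn from_fields(a: i32, b: &'static str) -> Self;
}

impl FromVariant2 for Variant2Data {
    fn from_fields(a: i32, b: &'static str) -> Self {
        Variant2Data(a, b)
    }
}

impl FromVariant2 for Foo {
    fn from_fields(a: i32, b: &'static str) -> Self {
        Foo::Variant2(Variant2Data(a, b))
    }
}

// Stand-in for the generic constructor `Variant2<T>`.
pub fn variant2<T: FromVariant2>(a: i32, b: &'static str) -> T {
    T::from_fields(a, b)
}

pub fn pick() -> (Foo, Variant2Data) {
    // The result type is fixed when the function is referenced.
    let as_enum: fn(i32, &'static str) -> Foo = variant2::<Foo>;
    let as_variant: fn(i32, &'static str) -> Variant2Data = variant2::<Variant2Data>;
    (as_enum(1, "a"), as_variant(2, "b"))
}
```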
|
||
Note that default type parameters on functions have | ||
[recently](https://github.com/rust-lang/rust/pull/30724) been feature-gated for | ||
more consideration. The compiler will only accept referencing a generic function | ||
without specifying type parameters if using the | ||
`default_type_parameter_fallback` feature. | ||
|
||
|
||
## Matching | ||
|
||
When matching an enum, the whole value can be bound to a variable using | ||
`@` syntax. Currently such a variable has enum type. With this RFC it would get | ||
the same treatment as newly constructed variants, i.e., it could be inferred to | ||
have either the variant or enum type, with the enum type by default. | ||
|
||
Example: | ||
|
||
```rust | ||
fn bar(f: Foo) { | ||
match f { | ||
v1 @ Foo::Variant1 => { | ||
let f: Foo = v1; | ||
} | ||
v2 @ Foo::Variant2(..) => { | ||
let v: Foo::Variant2 = v2; | ||
} | ||
_ => {} | ||
} | ||
} | ||
``` | ||
|
||
Both branches type check. | ||
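
For contrast, here is what the second branch looks like in current Rust, where the `@` binding only ever has the enum type. The function `second_field` is a hypothetical example; note the second pattern match that the variant type would make unnecessary.

```rust
#[derive(Debug, PartialEq)]
pub enum Foo {
    Variant1,
    Variant2(i32, &'static str),
    Variant3 { f1: i32, f2: &'static str },
}

pub fn second_field(f: Foo) -> Option<&'static str> {
    match f {
        v2 @ Foo::Variant2(..) => {
            // Today `v2` is a `Foo`, so its fields are not directly accessible.
            let whole: Foo = v2;
            if let Foo::Variant2(_, s) = whole {
                Some(s)
            } else {
                unreachable!("already matched as Variant2")
            }
        }
        _ => None,
    }
}
```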
|
||
|
||
## Representation | ||
|
||
Enum values have the same representation whether they have enum or variant type. | ||
That is, a value with variant type will still include the discriminant and | ||
padding to the size of the largest variant. This is to make sharing | ||
implementations easier (via coercion), see below. | ||
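
The cost of this choice can be made concrete with `std::mem::size_of` in current Rust. In the sketch below (the function name `sizes` is hypothetical), the enum is at least as large as the payload of its largest variant. Under the proposed representation, a value of type `Foo::Variant1` would occupy that same space even though it carries no data.

```rust
use std::mem::size_of;

pub enum Foo {
    Variant1,
    Variant2(i32, &'static str),
    Variant3 { f1: i32, f2: &'static str },
}

// Returns (size of the enum, size of the largest payload, size of an empty payload).
pub fn sizes() -> (usize, usize, usize) {
    (
        size_of::<Foo>(),
        size_of::<(i32, &'static str)>(),
        size_of::<()>(),
    )
}
```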
|
||
## Conversions | ||
|
||
A variant value may be implicitly coerced to its corresponding enum type (an | ||
upcast). An enum value may be explicitly cast to the type of any of its variants | ||
(a downcast). Such a cast includes a dynamic check of the discriminant and will | ||
panic if the cast is to the wrong variant. Variant values may not be converted | ||
to other variant types. E.g., | ||
|
||
```rust | ||
let a: Foo::Variant1 = Foo::Variant1; | ||
let b: Foo = a; // Ok | ||
let _: Foo::Variant2 = a; // Compile-time error | ||
let _: Foo::Variant2 = b; // Compile-time error | ||
let _ = a as Foo::Variant2; // Compile-time error | ||
let _ = b as Foo::Variant2; // Runtime error | ||
I don't see checked downcasting here. I would really prefer:
match b {
    x: Foo::Variant2 => {...}
    _ => { /*not Variant2*/ }
}
Then the user can do something useful instead of panicking, and I believe this is necessary if we want to experiment with modelling the DOM as an ADT hierarchy.
Oh yes, I want this, it is super-important, but I forgot to put it in. I expect we can keep the current syntax:
I believe the inference approach should work, if we pin the variant type on variant-specific operations (like accessing a field).
I'm strongly against panicking on cast for anything but debugging. |
||
let _ = b as Foo::Variant1; // Ok | ||
``` | ||
|
||
See the alternatives below; it may be better not to support downcasting. | ||
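
The upcast and a non-panicking downcast can be emulated today with `From` and `TryFrom`. This is a sketch under assumptions, not the mechanism the RFC proposes: the struct `Variant2` stands in for the variant type, and `upcast` and `downcast` are hypothetical helper names. A failed downcast hands the original value back instead of panicking.

```rust
use std::convert::TryFrom;

// Stand-in for the variant type `Foo::Variant2`.
#[derive(Debug, PartialEq)]
pub struct Variant2(pub i32, pub &'static str);

#[derive(Debug, PartialEq)]
pub enum Foo {
    Variant1,
    Variant2(Variant2),
}

// Upcast: always succeeds.
impl From<Variant2> for Foo {
    fn from(v: Variant2) -> Foo {
        Foo::Variant2(v)
    }
}

// Downcast: checks the discriminant and returns the enum on mismatch.
impl TryFrom<Foo> for Variant2 {
    type Error = Foo;

    fn try_from(f: Foo) -> Result<Self, Foo> {
        match f {
            Foo::Variant2(v) => Ok(v),
            other => Err(other),
        }
    }
}

pub fn upcast(v: Variant2) -> Foo {
    Foo::from(v)
}

pub fn downcast(f: Foo) -> Result<Variant2, Foo> {
    Variant2::try_from(f)
}
```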
|
||
|
||
## impls | ||
|
||
`impl`s may exist for both enum and variant types. There is no explicit sharing | ||
Would this let us fix rust-lang/rust#5244? Example:
enum Option<T> {
    Some(T),
    None,
}
// we add this
impl<T> Copy for Option<T>::None {}
// then, either this just works
let x: [Option<String>; 10] = [None; 10];
// or this works (can this be written without the temporary `t`?)
let t: [Option<String>::None; 10] = [None; 10];
let x: [Option<String>; 10] = t;
It would likely require that the variant types have the same size as the enum itself, so they'd have to have unused padding for things like the discriminant.
That is the case according to the RFC.
The correct solution here is not using While @japaric's proposal may be equivalent to the latter option,
On Fri, Jan 08, 2016 at 07:14:59AM -0800, Jorge Aparicio wrote:
Yes, perhaps, that's one of the appealing things about having variants
On Fri, Jan 08, 2016 at 10:06:19AM -0800, Eduard-Mihai Burtescu wrote:
In the RFC as written, I think something like |
||
of impls, and the fact that an enum has a trait bound does not imply that the | ||
variant also has that bound. However, the usual conversion rules apply, so if a | ||
method would apply to the enum type, it can be called on a variant value due to | ||
coercion performed by the dot operator. | ||
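
The dot-operator behaviour has a rough analogue in current Rust: a wrapper type that dereferences to the enum can call the enum's inherent methods without implementing them itself. This is only an analogy, using `Deref` rather than the coercion the RFC describes. The wrapper `Variant2`, its constructor `new`, the method `name` and the function `name_via_dot` are all hypothetical.

```rust
use std::ops::Deref;

pub enum Foo {
    Variant1,
    Variant2(i32, &'static str),
}

impl Foo {
    // A method defined only on the enum type.
    pub fn name(&self) -> &'static str {
        match self {
            Foo::Variant1 => "Variant1",
            Foo::Variant2(..) => "Variant2",
        }
    }
}

// Stand-in for the variant type: it stores the full enum value,
// mirroring the "same representation" rule, and is only built around a Variant2.
pub struct Variant2(Foo);

impl Variant2 {
    pub fn new(a: i32, b: &'static str) -> Self {
        Variant2(Foo::Variant2(a, b))
    }
}

impl Deref for Variant2 {
    type Target = Foo;

    fn deref(&self) -> &Foo {
        &self.0
    }
}

pub fn name_via_dot() -> &'static str {
    // `name` is not defined on `Variant2`; the dot operator finds it on `Foo`.
    Variant2::new(42, "Hello!").name()
}
```

As in the RFC text, a trait implemented for `Foo` is not thereby implemented for the wrapper; only method calls through the dot operator reach the enum's impls.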
|
||
|
||
## Extension - unsized enums | ||
|
||
With the above representation, enum variants are the same size as the enum | ||
itself, which is the size of the largest variant plus the discriminant. This makes | ||
conversion between the variant and enum types easy, and should be the default. | ||
However, there are some use cases where it is preferable to have a more minimal | ||
size for variant values. For example, where variants are of wildly different | ||
sizes and where we usually deal with individual variants and rarely the whole | ||
enum. | ||
|
||
For such use cases, we could support an `#[unsized]` attribute on the enum. This | ||
affects the representation of the variants: a variant value still has the | ||
discriminant, but it is not padded to the enum size, so the value is the size of | ||
the individual variant plus the discriminant. | ||
|
||
The enum type (but not the variant types) is considered unsized. The effect is | ||
that it may not appear in a Rust program by value, only by reference. There is | ||
no 'unsizing information' (c.f., slices or trait objects) so a pointer to an | ||
unsized enum is a regular pointer, not a fat pointer. | ||
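
The distinction between regular and fat pointers can be observed in current Rust. In this sketch (the function name `pointer_widths` is hypothetical), a reference to a sized type is one word wide, while references to slices and trait objects carry an extra word of unsizing information. A pointer to an `#[unsized]` enum as described above would fall in the first category.

```rust
use std::fmt::Debug;
use std::mem::size_of;

// Returns the widths of (thin reference, slice reference, trait object reference).
pub fn pointer_widths() -> (usize, usize, usize) {
    (
        size_of::<&u8>(),
        size_of::<&[u8]>(),
        size_of::<&dyn Debug>(),
    )
}
```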
|
||
Casting/coercion of enum/variant values can still work as before, since we'll | ||
never access the enum and find a different variant. | ||
|
||
### Further extension - remove the discriminant | ||
|
||
We don't need the discriminant when we have a value with variant type. We can't | ||
have a value with enum type. When we have a pointer with enum type, we could put | ||
the discriminant in the pointer (making it a fat pointer like we use with other | ||
unsized types). This should all work at the expense of some added complexity in | ||
coercion and matching. | ||
|
||
|
||
# Drawbacks | ||
|
||
The proposal is a little bit hairy, in part due to trying to remain backwards | ||
compatible. | ||
|
||
|
||
# Alternatives | ||
|
||
An alternative to allowing variants as types is allowing sets of variants as | ||
types, a kind of refinement type. This set could have one member and then would | ||
be equivalent to variant types, or could have all variants as members, making it | ||
equivalent to the enum type. Although more powerful, this approach is more | ||
complex, and I do not believe the complexity is justified. | ||
|
||
We could remove support for casting from enums to variants, relying on matching. | ||
casting can always be added later, and it would be the first cast that can panic at runtime. Also it's trivial to write an enum downcast macro that panics or does something other user defined. |
||
|
||
|
||
# Unresolved questions | ||
|
||
There is some potential overlap with some parts of some proposals for efficient | ||
inheritance: if we allow nested enums, then there are many more possible types | ||
for a variant, and generally more complexity. If we allow data bounds (c.f., | ||
trait bounds, e.g., a struct is a bound on any structs which inherit from it), | ||
then perhaps enum types should be considered bounds on their variant types. | ||
Perhaps we can retain only trait bounds and use e.g. |
||
There are also interesting questions around subtyping. However, without a | ||
concrete proposal, it is difficult to deeply consider the issues here. |
Nitpick: s/know/known/