Types for enum variants #1450

Closed
wants to merge 3 commits into from
243 changes: 243 additions & 0 deletions text/0000-variant-types.md
@@ -0,0 +1,243 @@
- Feature Name: variant_types
- Start Date: 2016-01-07
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary

Make enum variants first-class types. Variant types can be used just like any
other type. When a new instance of an enum is created, or we use `@` syntax in a
match expression to create a variable know to be a particular variant, we choose
@adaszko, Jul 15, 2017

Nitpick: s/know/known/

between the enum and variant type in a similar way to the treatment of integers.
We default to the enum type to preserve backwards compatibility.

This RFC previously included a proposal for untagged enums as a kind of union
data type. That has been removed.


# Motivation

Enums are a convenient way of dealing with data which can be in one of many
forms. When dealing with such data, it is typical to match, then perform some
operations on the interior data. However, in many cases there is a large amount
of processing to be done. Ideally we would factor that out into a function,
passing the data to the function. However, currently in Rust, enum variants are
not types, so we must choose an unsatisfactory workaround: we pass
each field of the variant separately (leading to unwieldy function signatures
and poor maintainability), we pass the whole variant with enum type (and have to
match again, with `unreachable!` arms in the function), or we embed a struct
within the variant and pass the struct (duplicating data structures for no good
reason). It would be much nicer if we could refer to the variant directly in the
type system.
Member

Would be cool to have some examples here to make the motivations clearer.
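
As a rough illustration of the three workarounds the Motivation lists, here is a sketch in current stable Rust. The `Shape` and `ShapeWithStruct` enums, the `RectData` struct, and all function names are hypothetical and do not come from the RFC.

```rust
// Hypothetical example types; none of these names come from the RFC.
#[allow(dead_code)]
pub enum Shape {
    Circle { radius: f64 },
    Rect { w: f64, h: f64 },
}

// Workaround 1: pass each field of the variant separately.
pub fn rect_area_from_fields(w: f64, h: f64) -> f64 {
    w * h
}

// Workaround 2: pass the whole enum and match again inside the function.
pub fn rect_area_from_enum(shape: &Shape) -> f64 {
    match shape {
        Shape::Rect { w, h } => w * h,
        // The caller already knows this is a `Rect`, but the type cannot say so.
        _ => unreachable!("caller must pass Shape::Rect"),
    }
}

// Workaround 3: embed a struct in the variant and pass the struct.
pub struct RectData {
    pub w: f64,
    pub h: f64,
}

#[allow(dead_code)]
pub enum ShapeWithStruct {
    Circle { radius: f64 },
    Rect(RectData),
}

pub fn rect_area_from_struct(rect: &RectData) -> f64 {
    rect.w * rect.h
}
```

Under the proposal, a single parameter of type `Shape::Rect` would presumably stand in for all three signatures.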



# Detailed design

Consider the example enum `Foo`:

```rust
pub enum Foo {
    Variant1,
    Variant2(i32, &'static str),
    Variant3 { f1: i32, f2: &'static str },
}
```

We create new instances by constructing one of the variants. The only type
introduced is `Foo`. Variant names can only be used in patterns and for creating
instances. E.g.,

```rust
fn new_foo() -> Foo {
    Foo::Variant2(42, "Hello!")
}
```

This RFC proposes allowing the programmer to use variant names as types, e.g.,

```rust
fn bar(x: Foo::Variant2) {}
struct Baz {
    field: Foo::Variant3,
}
```


## Constructors

Consider `let x = Foo::Variant1;`. Currently `x` has type `Foo`. In order to
preserve backwards compatibility, this must remain the case. However, it would
be convenient for `let x: Foo::Variant1 = Foo::Variant1;` to also be valid.

The type checker must consider multiple types for an enum construction
expression - both the variant type and the enum type. If there is no further
information to infer one or the other type, then the type checker uses the enum
type by default. This is analogous to the system we use for integer fallback or
default type parameters.
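
For readers less familiar with the integer fallback that this paragraph uses as its analogy, the following sketch shows the existing behaviour in stable Rust. The function name `integer_fallback_sizes` is made up for illustration, and the sizes are only used to make the inferred types observable.

```rust
use std::mem::size_of_val;

// Returns the byte sizes of two integer bindings to make their inferred types visible.
pub fn integer_fallback_sizes() -> (usize, usize) {
    let unconstrained = 42; // nothing constrains the literal, so it falls back to `i32`
    let annotated: u8 = 42; // the annotation selects `u8` for the same literal
    (size_of_val(&unconstrained), size_of_val(&annotated))
}
```

The proposal applies the same idea to `Foo::Variant1`: an annotation or other constraint could select the variant type, and the enum type is the fallback.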
Member

This is great, pretty much what I expected and @nikomatsakis also confirmed it's what he would do (although he and @aturon aren't sure it's enough i.e. compared to full subtyping).

Member

Note that it gets a bit more complicated if we also support nested enums, though probably the fallback would just be to the root in that case.

Member

Yes, I wanted to add that with nested enums we would have a chain of potential types to infer to, but I realized that nested enums aren't in the scope of this RFC.


The type of the variants when used as functions must change. Currently they have
a type which maps from the field types to the enum type:

```rust
let x: &Fn(i32, &'static str) -> Foo = &Foo::Variant2;
```

I.e., one could imagine an implicit function definition:

```rust
impl Foo {
    fn Variant2(a: i32, b: &'static str) -> Foo { ... }
}
```
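
The current behaviour described here, where a tuple variant acts as a function from its field types to the enum type, can be seen in a small sketch on stable Rust. The helper `build_with_constructor` is a hypothetical name; `Foo` is the example enum from this RFC.

```rust
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
pub enum Foo {
    Variant1,
    Variant2(i32, &'static str),
    Variant3 { f1: i32, f2: &'static str },
}

pub fn build_with_constructor() -> Vec<Foo> {
    // The name of a tuple variant coerces to a plain function pointer
    // whose return type is the enum, not the variant.
    let ctor: fn(i32, &'static str) -> Foo = Foo::Variant2;
    vec![(1, "one"), (2, "two")]
        .into_iter()
        .map(|(n, s)| ctor(n, s))
        .collect()
}
```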

This would change to accommodate inferring either the enum or the variant type.
Imagine:

```rust
impl Foo {
    fn Variant2<T=Foo>(a: i32, b: &'static str) -> T { ... }
}
```

Since we do not allow generic function types, the result type must be chosen
when the function is referenced:

```rust
let x: &Fn(i32, &'static str) -> Foo = &Foo::Variant2::<Foo>;
let x: &Fn(i32, &'static str) -> Foo::Variant2 = &Foo::Variant2::<Foo::Variant2>;
```

Due to the default type parameter, we remain backwards compatible:

```rust
let x: &Fn(i32, &'static str) -> Foo = &Foo::Variant2;
```

Note that default type parameters on functions have
[recently](https://github.com/rust-lang/rust/pull/30724) been feature-gated for
more consideration. The compiler will only accept referencing a generic function
without specifying type parameters if using the
`default_type_parameter_fallback` feature.


## Matching

When matching an enum, the whole value can be bound to a variable using
`@` syntax. Currently such a variable has enum type. With this RFC it would get
the same treatment as newly constructed variants, i.e., it could be inferred to
have either the variant or enum type, with the enum type by default.

Example:

```rust
fn bar(f: Foo) {
    match f {
        v1 @ Foo::Variant1 => {
            let f: Foo = v1;
        }
        v2 @ Foo::Variant2(..) => {
            let v: Foo::Variant2 = v2;
        }
        _ => {}
    }
}
```

Both branches type check.
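
For contrast, this is roughly what the same situation looks like in current stable Rust, where an `@` binding always has the enum type. The function `second_field` is a hypothetical helper written for this sketch.

```rust
#[allow(dead_code)]
pub enum Foo {
    Variant1,
    Variant2(i32, &'static str),
    Variant3 { f1: i32, f2: &'static str },
}

pub fn second_field(f: Foo) -> Option<&'static str> {
    match f {
        whole @ Foo::Variant2(..) => {
            // `whole` has type `Foo` here, so the variant must be matched a second time.
            if let Foo::Variant2(_, s) = whole {
                Some(s)
            } else {
                unreachable!("the outer arm already matched Variant2")
            }
        }
        _ => None,
    }
}
```

With the inference described above, the second match would not be needed, since `whole` could be given the variant type directly.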


## Representation

Enum values have the same representation whether they have enum or variant type.
That is, a value with variant type will still include the discriminant and
padding to the size of the largest variant. This is to make sharing
implementations easier (via coercion), see below.
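
To give a feel for the padding involved, the sketch below compares the size of an enum with the size of its largest payload in current Rust. The `Message` enum and the function name are hypothetical, and the exact enum size is a compiler detail for the default representation, so only the inequality is relied on.

```rust
use std::mem::size_of;

#[allow(dead_code)]
pub enum Message {
    Ping,
    Data([u8; 32]),
}

// Returns (size of the whole enum, size of the largest variant's payload).
pub fn enum_and_payload_sizes() -> (usize, usize) {
    (size_of::<Message>(), size_of::<[u8; 32]>())
}
```

Under this RFC, a value of type `Message::Ping` would occupy the first of these sizes, even though the variant itself carries no data.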

## Conversions

A variant value may be implicitly coerced to its corresponding enum type (an
upcast). An enum value may be explicitly cast to the type of any of its variants
(a downcast). Such a cast includes a dynamic check of the discriminant and will
panic if the cast is to the wrong variant. Variant values may not be converted
to other variant types. E.g.,

```rust
let a: Foo::Variant1 = Foo::Variant1;
let b: Foo = a; // Ok
let _: Foo::Variant2 = a; // Compile-time error
let _: Foo::Variant2 = b; // Compile-time error
let _ = a as Foo::Variant2; // Compile-time error
let _ = b as Foo::Variant2; // Runtime error
let _ = b as Foo::Variant1; // Ok
```

Member

I don't see checked downcasting here. I would really prefer:

match b {
    x: Foo::Variant2 => {...}
    _ => { /*not Variant2*/ }
}

Then the user can do something useful instead of panicking, and I believe this is necessary if we want to experiment with modelling the DOM as an ADT hierarchy.

Member Author

Oh yes, I want this, it is super-important, but I forgot to put it in. I expect we can keep the current syntax: `x @ Foo::Variant2 => { ... }`. I'm not sure if we then apply the same inference approach to `x` or `x` always has type `Variant2`.

Member

I believe the inference approach should work, if we pin the variant type on variant-specific operations (like accessing a field).

Contributor

I'm strongly against panicking on cast for anything but debugging.

See Alternatives below; it may be better not to support downcasting.
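
The upcast and a checked (non-panicking) downcast can be approximated in current stable Rust with the standard `From` and `TryFrom` traits. This is only a sketch and not the mechanism the RFC proposes: the standalone `Variant2` struct is a hypothetical stand-in for the type `Foo::Variant2`, and `upcast` and `checked_downcast` are made-up helper names.

```rust
use std::convert::TryFrom;

// Hypothetical stand-in for the proposed type `Foo::Variant2`.
#[derive(Debug, PartialEq)]
pub struct Variant2(pub i32, pub &'static str);

#[derive(Debug, PartialEq)]
pub enum Foo {
    Variant1,
    Variant2(Variant2),
}

// Upcast: always succeeds, like the implicit coercion in the RFC.
impl From<Variant2> for Foo {
    fn from(v: Variant2) -> Foo {
        Foo::Variant2(v)
    }
}

// Downcast: checks the discriminant and hands the value back on mismatch
// instead of panicking.
impl TryFrom<Foo> for Variant2 {
    type Error = Foo;

    fn try_from(f: Foo) -> Result<Self, Self::Error> {
        match f {
            Foo::Variant2(v) => Ok(v),
            other => Err(other),
        }
    }
}

pub fn upcast(v: Variant2) -> Foo {
    v.into()
}

pub fn checked_downcast(f: Foo) -> Result<Variant2, Foo> {
    Variant2::try_from(f)
}
```

Returning a `Result` lets the caller decide what to do on a mismatch, which is the behaviour several reviewers ask for in place of a panicking `as` cast.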


## impls

`impl`s may exist for both enum and variant types. There is no explicit sharing
Member

Would this let us fix rust-lang/rust#5244? Example:

enum Option<T> {
    Some(T),
    None,
}

// we add this
impl<T> Copy for Option<T>::None {}

// then, either this just works
let x: [Option<String>; 10] = [None; 10];

// or this works (can this be written without the temporary `t`?)
let t: [Option<String>::None; 10] = [None; 10];
let x: [Option<String>; 10] = t;

Member

It would likely require that the variant types have the same size as the enum itself, so they'd have to have unused padding for things like the discriminant.

Contributor

That is the case according to the RFC.

Member

The correct solution here is not using Copy directly, i.e. by either allowing constants, or ADTs of Copy-able types.

While @japaric's proposal may be equivalent to the latter option, let x: [Option<String>; 10] = [None; 10]; would not work with this RFC as written because None would infer to Option<String> which is not Copy - and there is no conversion from [None<T>; N] to [Option<T>; N] (but there could be? not sure what we can do here).

Contributor

On Fri, Jan 08, 2016 at 07:14:59AM -0800, Jorge Aparicio wrote:

> Would this let us fix rust-lang/rust#5244?

Yes, perhaps, that's one of the appealing things about having variants
be types.

Contributor

On Fri, Jan 08, 2016 at 10:06:19AM -0800, Eduard-Mihai Burtescu wrote:

> While @japaric's proposal may be equivalent to the latter option, let x: [Option<String>; 10] = [None; 10]; would not work with this RFC as written because None would infer to Option<String> which is not Copy - and there is no conversion from [None<T>; N] to [Option<T>; N] (but there could be? not sure what we can do here).

In the RFC as written, I think something like [None: None<String>; 32] would be required?

of impls, and an enum having a trait bound does not imply that the
variant also has that bound. However, the usual conversion rules apply, so if a
method would apply to the enum type, it can be called on a variant value due to
coercion performed by the dot operator.
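
On the array-initialisation case raised in the review thread above, one reviewer points to constants as an alternative to `Copy`. In current stable Rust that route does work, as this sketch shows. The function name `ten_nones` and the constant `NONE` are made up for illustration, and this is independent of the variant-type mechanism proposed here.

```rust
pub fn ten_nones() -> [Option<String>; 10] {
    // A constant may be used in an array repeat expression
    // even though `Option<String>` is not `Copy`.
    const NONE: Option<String> = None;
    [NONE; 10]
}
```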


## Extension - unsized enums

With the above representation, enum variants are the same size as the enum
itself, which is the size of the largest variant plus the discriminant. This makes
conversion between the variant and enum types easy, and should be the default.
However, there are some use cases where it is preferable to have a more minimal
size for variant values. For example, where variants are of wildly different
sizes and where we usually deal with individual variants and rarely the whole
enum.

For such use cases, we could support an `#[unsized]` attribute on the enum. This
affects the representation of the variants: a variant value still has the
discriminant, but it is not padded to the enum size, so the value is the size of
the individual variant plus the discriminant.

The enum type (but not the variant types) is considered unsized. The effect is
that values of the enum type may not appear in a Rust program by value, only by
reference. There is no 'unsizing information' (cf. slices or trait objects), so
a pointer to an unsized enum is a regular pointer, not a fat pointer.

Casting/coercion of enum/variant values can still work as before, since we'll
never access the enum and find a different variant.

### Further extension - remove the discriminant

We don't need the discriminant when we have a value with variant type. We can't
have a value with enum type. When we have a pointer with enum type, we could put
the discriminant in the pointer (making it a fat pointer like we use with other
unsized types). This should all work at the expense of some added complexity in
coercion and matching.


# Drawbacks

The proposal is a little bit hairy, in part due to trying to remain backwards
compatible.


# Alternatives

An alternative to allowing variants as types is allowing sets of variants as
types, a kind of refinement type. This set could have one member and then would
be equivalent to variant types, or could have all variants as members, making it
equivalent to the enum type. Although more powerful, this approach is more
complex, and I do not believe the complexity is justified.

We could remove support for casting from enums to variants, relying on matching.
Contributor

casting can always be added later, and it would be the first cast that can panic at runtime.

Also it's trivial to write an enum downcast macro that panics or does something else user-defined.
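
A downcast macro of the kind this comment mentions might look like the sketch below in current stable Rust. The macro name `downcast!` and the helper `variant2_fields` are hypothetical. This version returns an `Option` so the caller chooses between panicking (for example with `expect`) and handling the mismatch.

```rust
#[allow(dead_code)]
pub enum Foo {
    Variant1,
    Variant2(i32, &'static str),
    Variant3 { f1: i32, f2: &'static str },
}

// Matches `$value` against `$pattern`; yields `Some($result)` on success, `None` otherwise.
macro_rules! downcast {
    ($value:expr, $pattern:pat => $result:expr) => {
        match $value {
            $pattern => Some($result),
            _ => None,
        }
    };
}

pub fn variant2_fields(f: Foo) -> Option<(i32, &'static str)> {
    downcast!(f, Foo::Variant2(n, s) => (n, s))
}
```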



# Unresolved questions

There is some potential overlap with some parts of some proposals for efficient
inheritance: if we allow nested enums, then there are many more possible types
for a variant, and generally more complexity. If we allow data bounds (c.f.,
trait bounds, e.g., a struct is a bound on any structs which inherit from it),
then perhaps enum types should be considered bounds on their variant types.
Member

Perhaps we can retain only trait bounds and use e.g. V: VariantOf<E> - which is what I've used in my version of the examples in @nikomatsakis' blog post introducing many of these ideas.

There are also interesting questions around subtyping. However, without a
concrete proposal, it is difficult to deeply consider the issues here.