New trait: core::convert::IntoUnderlying #3046
This adds a trait to allow conversions from enums with a primitive representation to their primitive representation. [Rendered](https://github.com/illicitonion/rfcs/blob/into-underlying/text/3046-into-underlying.md)
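To make the proposal concrete, here is a minimal sketch of what a hand-written implementation could look like. The trait is declared locally because it is only proposed and is not part of `core` or `std`; the `Number` enum and the body of the impl are assumptions about what a derive might generate.

```rust
// Local copy of the proposed trait (an assumption: it does not exist in `core`).
pub trait IntoUnderlying {
    type Underlying;
    fn into_underlying(self) -> Self::Underlying;
}

// Hypothetical enum with an explicit primitive representation.
#[repr(u8)]
pub enum Number {
    Zero = 0,
    One = 1,
}

// Roughly what a derive could expand to for a `repr(u8)` enum.
impl IntoUnderlying for Number {
    type Underlying = u8;

    fn into_underlying(self) -> u8 {
        // The cast targets the declared repr type, so no truncation can occur.
        self as u8
    }
}

fn main() {
    // No type annotation is needed: the associated type fixes the result to `u8`.
    let n = Number::One.into_underlying();
    assert_eq!(n, 1_u8);
}
```

If the repr later changed to `u64`, only the impl would need updating, and callers that assumed `u8` would get a type error instead of a silent truncation.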
Currently this tends to be done using `as`, which has some flaws:
1. It will silently truncate, e.g. if you have a `repr(u64)` enum, and you write `value as u8` (perhaps because the repr changed), you get no warning or error that truncation will occur.
2. You cannot use enums in code which is generic over `From` or `Into`.
3. Not all enums _want_ to expose conversion to their underlying repr as part of their public API.
Instead, by allowing an `IntoUnderlying` implementation to be easily derived, we can avoid these issues.
There are a number of crates which provide macros to provide this implementation, as well as `TryFrom` implementations. I believe this is enough of a foot-gun that it should be rolled into `std`. See rust-lang/rfcs#3046 for more information.
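The truncation hazard can be reproduced in a few lines. This is a small sketch with a hypothetical `Offset` enum (not taken from the RFC), showing that `as` accepts a narrowing cast without complaint.

```rust
// Hypothetical enum for illustration; the names and values are assumptions.
#[repr(u64)]
enum Offset {
    Small = 1,
    Large = 0x1_0000_0001, // does not fit in a `u8`
}

// Narrowing cast: compiles without any warning from rustc.
fn large_as_u8() -> u8 {
    Offset::Large as u8
}

// Cast to the declared repr type: the value is preserved.
fn large_as_u64() -> u64 {
    Offset::Large as u64
}

fn main() {
    // Only the low 8 bits survive, so `Large` collapses onto `Small`.
    assert_eq!(large_as_u8(), 1);
    assert_eq!(large_as_u8(), Offset::Small as u8);
    assert_eq!(large_as_u64(), 0x1_0000_0001);
}
```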
/// This trait should only be implemented where conversion is trivial; for non-trivial conversions,
/// prefer to implement [`Into`].
pub trait IntoUnderlying {
type Underlying;
fn into_underlying(self) -> Self::Underlying;
}
That "should" got me thinking. How important is it? To explore:
But I guess both of those lose the "method that doesn't need type annotations" part of the proposal. I think it's interesting, though. Maybe there's a way to combine them? Sketch of an undeveloped idea:
pub trait Wrapper {
type Underlying;
fn into_underlying(self) -> Self::Underlying
where Self: Into<Self::Underlying>
{
self.into()
}
fn from_underlying(x: Self::Underlying) -> Self
where Self::Underlying: Into<Self>
{
x.into()
}
fn try_into_underlying(self) -> Result<Self::Underlying, <Self as TryInto<Self::Underlying>>::Error>
where Self: TryInto<Self::Underlying>
{
self.try_into()
}
fn try_from_underlying(x: Self::Underlying) -> Result<Self, <Self::Underlying as TryInto<Self>>::Error>
where Self::Underlying: TryInto<Self>
{
x.try_into()
}
}
So that there'd never be a reason to implement this directly. That does still have derive problems, though.
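As a rough illustration of how such a blanket-method trait might be used, the sketch below keeps only the two infallible methods and applies them to a hypothetical `Meters` newtype. The `Sized` supertrait is an addition made here so the by-value default methods are straightforwardly well-formed; it is not part of the sketch above.

```rust
// Reduced version of the `Wrapper` idea; `: Sized` is an assumption added here.
pub trait Wrapper: Sized {
    type Underlying;

    fn into_underlying(self) -> Self::Underlying
    where
        Self: Into<Self::Underlying>,
    {
        self.into()
    }

    fn from_underlying(x: Self::Underlying) -> Self
    where
        Self::Underlying: Into<Self>,
    {
        x.into()
    }
}

// Hypothetical newtype used only for illustration.
#[derive(Debug, PartialEq)]
pub struct Meters(pub u32);

impl From<Meters> for u32 {
    fn from(m: Meters) -> u32 {
        m.0
    }
}

impl From<u32> for Meters {
    fn from(v: u32) -> Meters {
        Meters(v)
    }
}

// The impl only names the underlying type; the methods come from the defaults.
impl Wrapper for Meters {
    type Underlying = u32;
}

fn main() {
    // No annotation needed: `Underlying` pins the target type.
    let raw = Meters(5).into_underlying();
    assert_eq!(raw, 5);
    assert_eq!(Meters::from_underlying(7), Meters(7));
}
```

The conversions themselves still live in ordinary `From` impls, which is the point of the sketch: the trait only selects which target type is "the" underlying one.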
I think it's not super important, but there is value in having what is effectively a marker trait to show the conversion is expected to be ~free. Maybe there should be a marker trait for
I hadn't read up on the safer transmute work, but now that I have, I think this is exactly equivalent to what in an earlier version of the RFC was: #[derive(PromiseTransmutableInto)]
#[repr(u8)]
enum Number {
Zero,
One,
}
fn main() {
let n: u8 = Number::Zero.transmute_into();
assert_eq!(n, 0_u8);
}
which is nicely more general. Per the current RFC, I think
#[derive(TransmutableInto)]
#[repr(u8)]
enum Number {
Zero,
One,
} would be a reasonable derive, and would obviate the need for
I think this is an interesting thing to consider, but as you note, we'd still need some way of indicating the derive, with or without this mechanism, at which point I feel like this would be an interesting layer, but probably something to contemplate alongside/after, rather than instead... One thing on naming worth noting is that I'm somewhat wary of the term "wrapper" - I think there are two importantly distinct concepts here; a wrapping type and a transmutable type. A primitive-repr'd enum doesn't wrap a number, it is represented by a number. An |
Thanks for pushing on this! Rather than trying to abstract over newtypes and enums, might I suggest a slightly more limited/gradual approach?
A Gradual Approach to Enum Discriminant Reform
Enums are tricky. For instance, although all enums have a discriminant type, that type isn't necessarily the same type used for storing the in-memory tag. And, although all variants have a logical discriminant, you can't necessarily access it using
1. Referring to the Discriminant Type
Since all enums have a discriminant type, let's consider starting there, with just this compiler-intrinsic (i.e., automatically implemented + sealed) trait:
#[lang = "enum_trait"]
pub trait Enum {
/// The discriminant type of the enum.
type Discriminant;
} With this, I can use enum Foo { Bar, Baz }
Foo::Bar as someprimitivetype
...I'd instead write:
Foo::Bar as Foo::Discriminant
With this associated type, the foot-gunny
2. Uniform Discriminant Access
Although all variants have a logical discriminant, you can't necessarily access it using
enum Foo {
Bar(),
Baz,
}
assert_eq!(Foo::Baz as isize, 1); The discriminant of If we add a field to #![feature(arbitrary_enum_discriminant, never_type)]
#[repr(u8)]
enum Foo {
Bar(!) = 42,
Baz, // discriminant/tag is 43
}
let _ = Foo::Baz as u8; // error[E0605]: non-primitive cast: `Foo` as `u8` Let's fix this. Since all enum variants arguably have a logical discriminant, let's extend our trait with a method to retrieve it: #[lang = "enum_trait"]
pub trait Enum {
/// The discriminant type of the enum.
type Discriminant;
/// Produces the discriminant of the given enum variant.
fn discriminant(&self) -> Self::Discriminant { ... }
}
With this method,
3. Tag Reform?
Discriminants are logical values associated with enum variants that sometimes correspond to the tags used in the in-memory representation. Can we provide a method for retrieving a variant's tag only when the discriminant and tag match? Yes, with safe transmute!
#[lang = "enum_trait"]
pub trait Enum {
/// The discriminant type of the enum.
type Discriminant;
/// Produces the discriminant of the given enum variant.
fn discriminant(&self) -> Self::Discriminant { ... }
/// Produces the tag of the given enum variant.
fn tag(&self) -> Self::Discriminant
where
Self::Discriminant: TransmuteFrom<Self>
{
Self::Discriminant::transmute_from(self)
}
} That With this method, you no longer need unsafe code to get the tag of an enum! |
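For comparison, this is roughly the unsafe code that reading a tag requires today. It is a sketch for a hypothetical `#[repr(u8)]` enum with a field-carrying variant, relying on the layout that primitive-repr enums are documented to have (the tag is stored first); the inherent `tag` method is an illustration, not a proposed API.

```rust
// Hypothetical enum; with `repr(u8)` the tag is a leading `u8`.
#[repr(u8)]
enum Foo {
    Bar(u16), // implicit discriminant 0
    Baz,      // implicit discriminant 1
}

impl Foo {
    /// Reads the in-memory tag by reinterpreting the enum's first byte.
    fn tag(&self) -> u8 {
        // SAFETY: a `#[repr(u8)]` enum is laid out as a union of `#[repr(C)]`
        // structs that each begin with the `u8` tag, so the first byte is
        // always an initialized tag value.
        unsafe { *(self as *const Self as *const u8) }
    }
}

fn main() {
    // `as` cannot be used here because `Bar` carries data.
    assert_eq!(Foo::Bar(7).tag(), 0);
    assert_eq!(Foo::Baz.tag(), 1);
}
```

The same pointer cast on an enum without a primitive or `C` repr would not be sound, which is the gap the `Tagged`-style bound discussed in this thread is meant to close.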
fn tag(&self) -> Self::Discriminant
where
Self::Discriminant: TransmuteFrom<Self>
{
Self::Discriminant::transmute_from(self)
} That doesn't produce the discriminant for |
Yep, well spotted. It does produce the tag for We'd need to reflect a bit more information about layouts into the type system before a /// Implemented for all enum `repr`s that have an in-memory tag.
/// Namely: `C`, the primitive reprs (e.g., `u8`), and combinations of `C` and a primitive repr.
pub trait Tagged: Sealed {}
#[lang = "enum_trait"]
pub trait Enum {
/// The discriminant type of the enum.
type Discriminant;
/// A type encoding the `repr` of the enum.
type Repr;
/// Produces the discriminant of the given enum variant.
fn discriminant(&self) -> Self::Discriminant { ... }
/// Produces the tag of the given enum variant.
fn tag(&self) -> Self::Discriminant
where
Self::Repr: Tagged,
Self::Discriminant: TransmuteFrom<Self>
{
Self::Discriminant::transmute_from(self)
}
} (Not pictured: a ton of design work and bikeshedding that would need to happen! But, let's not fill up this RFC's issue with comments on this tangent.) Anyways, there's definitely a path all the way to tag retrieval with this gradual approach. But, even just doing the first two steps would comprehensively address the perils of |
Thanks @jswrenn for your thoughts here! I was not aware of the fn-pointer footgun, that's special.
On fn-pointers
I think that the bug here is allowing Ignoring stability concerns, I would suggest that the following simply be compile errors:
#[repr(u8)]
enum Number {
Zero(),
} and enum Number {
Zero(),
}
fn main() {
println!("{}", Number::Zero as isize);
}
in the same way that Possibly more controversially, I would also argue that empty tuple-like or struct-like enum variants should just be outright forbidden, or at least their presence should force an enum to no longer be considered as primitive types. In practice, for this proposal, I think the refinement needed is to state that the derive-trait will only be supported where none of the variants have tuple-like or struct-like syntax (at least for the first cut). Manually implementing the trait should still be fine, though, and we could potentially expand support in the future. Equivalently, if we end up with an auto
On an automatically implemented intrinsic vs an explicitly implemented trait
One of the goals of this RFC is to make stability guarantees more clear, so I'm not sure about automatically implementing a trait vs needing to explicitly implement one (possibly via a derive).
There's already Is there a strong argument for automatically implementing a trait, rather than requiring opt-in?
On naming
I don't have strong views on having an enum-specific
On tags
I'd ideally like to avoid tags in this RFC - they feel like a more general form of metadata, rather than specifically concerned with conversion. If we end up with a sealed enum-specific trait we can naturally extend it in the future if we want, otherwise I suggest this should live elsewhere (either a separate trait, or alongside
Yes: all enums already have an associated discriminant type, and knowing that type is important to Such a trait would be to make This trait should be placed in If such a trait makes stability guarantees clearer, it's only insofar that it might draw more attention to the incredibly subtle stability implications of enum discriminants that already exist. If we could roll back the clock to pre-1.0, I'd say there isn't a compelling argument for discriminants and In contrast, discriminants on Broadly: I see comprehensive enum reform as a long-term social challenge. Given that we can't merely turn back the clock, the best we can do is to nudge people in the right direction with lints over a suitably long period of time and then perhaps an edition change. Skipping right to introducing a new |
tl;dr: If we're trying to make I think that all generally makes sense, but there's a question as to whether in the long term we want discriminants to always be part of an enum's stability guarantee. There are enums where their representation is an incidental detail, and not intended to be stable; imagine: enum Color {
Red,
Green,
Blue,
...
} The enum author may want the freedom to grow the enum arbitrarily, restrict it to a smaller size, or re-order variants as they wish in the future. I'm hoping we have a choice as to the road forwards; we can either accept that all enums must include their representation in their stability guarantee, or we can work towards enabling hiding representation from stability guarantees. Maybe because we shipped I guess an alternative road forwards could be to introduce something like |
Not so hard: we'd just deprecate the trait! :) But, the trait would continue to be very useful in the case of
I totally agree with this ideal. Unfortunately, many stability hazards (and more!) stem from the existence #[non_exhaustive]
enum Foo {
Bar
} ...because if we add the variant And the stability hazards unrelated to #[non_exhaustive]
pub enum Foo {
Bar,
// Baz,
}
#[repr(transparent)]
struct Fizz(u8, Foo); ...uncommenting the A design of a more opaque enum would need to thoroughly account for these stability hazards, and more. (@joshlf's 2020 call for "private enum variants" comes to mind.) Otherwise, we might just end up with a new kind of enum that merely has its own distinct set of subtleties to worry about. |
Additionally: As written, I'm not sure that this RFC moves towards this goal. Take, for instance, this example in the RFC: #[derive(IntoUnderlying)]
#[repr(u8)]
enum Number {
Zero,
One,
}
fn main() {
assert_eq!(Number::Zero.into_underlying(), 0_u8);
} This still exposes the ordering issues of implicit discriminants and Here, as with #[repr(u8)]
enum Number {
Zero = 0,
One = 1,
} ...there would be no danger in reordering those variants, or adding new unit variants. So, what would improve this situation is linting against implicitly set discriminants. |
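The reordering hazard can be checked directly. The sketch below models two hypothetical releases of a crate as modules `v1` and `v2` (names assumed for illustration) and compares what a cast observes when variants are reordered.

```rust
// "Release 1" of a hypothetical crate.
mod v1 {
    #[repr(u8)]
    pub enum Implicit { Zero, One }

    #[repr(u8)]
    pub enum Explicit { Zero = 0, One = 1 }
}

// "Release 2": the same enums with their variants reordered.
mod v2 {
    #[repr(u8)]
    pub enum Implicit { One, Zero }

    #[repr(u8)]
    pub enum Explicit { One = 1, Zero = 0 }
}

fn main() {
    // Implicit discriminants follow declaration order, so the value changes.
    assert_eq!(v1::Implicit::One as u8, 1);
    assert_eq!(v2::Implicit::One as u8, 0);

    // Explicit discriminants are unaffected by the reordering.
    assert_eq!(v1::Explicit::One as u8, v2::Explicit::One as u8);
    assert_eq!(v1::Explicit::Zero as u8, v2::Explicit::Zero as u8);
}
```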
Having a For example, in a Pratt parser, to get the next level of precedence, I am at the moment using
But we don't like doing this in std!
How would you feel about automatically deriving this trait only for explicitly repr'd enums, with no fields? i.e. no enum Color {
Red,
Blue,
} or #[repr(u8)]
enum Color {
RGB(u8, u8, u8),
} But one for: #[repr(u8)]
enum Color {
Red,
Blue,
} The idea being that by being explicit about the layout of your enum, you're opting in to exposing that information (and as you note, in a non-stability-opting-in way). I suspect (again, without data) that most enums which aren't intended to be interchangeable with numbers or passed over FFI boundaries don't have an explicit repr, though I'm not sure how confident I would be in claiming that most enums which do have an explicit repr are intended for these purposes. That's the reason I'm looking for an explicit attribute, and I think an explicit repr is a sufficient explicit attribute.
👍
The thing I find upsetting about this is that this is valid: #[repr(u8)]
pub enum Foo {
Bar,
Baz(u16),
} which arguably it shouldn't be - what does this repr attribute even mean? But I think if we're happy with "explicit repr, no fields or constructor means you get an
I did not realise this one! Again, though, requiring an explicit #[non_exhaustive]
#[repr(u8)]
pub enum Foo {
Bar,
} is not zero-sized.
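The size claim is easy to verify with `std::mem::size_of`. In this sketch the enum names are hypothetical; the size of the enum without a repr is printed rather than asserted, because the default representation is not a stable guarantee.

```rust
use std::mem::size_of;

// Single-variant enum with the default representation.
#[allow(dead_code)]
#[non_exhaustive]
pub enum NoRepr {
    Bar,
}

// Same shape, but with an explicit primitive representation.
#[allow(dead_code)]
#[non_exhaustive]
#[repr(u8)]
pub enum WithRepr {
    Bar,
}

fn main() {
    // Unspecified layout; current compilers typically report 0 here.
    println!("size_of::<NoRepr>() = {}", size_of::<NoRepr>());

    // With `repr(u8)` the enum always occupies one byte for its tag.
    assert_eq!(size_of::<WithRepr>(), 1);
}
```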
I think that the explicitness of the attributes make this ok; either (and definitely both) of setting |
I completely agree! In the mean time, might I suggest the num_enum crate, which allows you to derive a |
If you want to signal that your enum should be SemVer-stably convertible to or from a primitive value, you (probably) should do so with the same mechanism you'd use for any other type: by implementing
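Assuming the standard conversion traits are the mechanism meant here, a hand-written version might look like the sketch below. The `Number` enum and the choice of `u8` as the error type are illustrative assumptions; the mapping is spelled out with `match` so it does not depend on discriminants or variant order at all.

```rust
use std::convert::TryFrom;

// Hypothetical enum; note that it needs no `repr` and no explicit discriminants.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Number {
    Zero,
    One,
}

// Infallible direction: every variant maps to a documented value.
impl From<Number> for u8 {
    fn from(n: Number) -> u8 {
        match n {
            Number::Zero => 0,
            Number::One => 1,
        }
    }
}

// Fallible direction: unknown values are returned to the caller as the error.
impl TryFrom<u8> for Number {
    type Error = u8;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        match value {
            0 => Ok(Number::Zero),
            1 => Ok(Number::One),
            other => Err(other),
        }
    }
}

fn main() {
    assert_eq!(u8::from(Number::One), 1);
    assert_eq!(Number::try_from(0_u8), Ok(Number::Zero));
    assert_eq!(Number::try_from(9_u8), Err(9));
}
```

Because these impls are ordinary public API, they also work in code that is generic over `From` and `Into`, which a bare `as` cast cannot do.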
Why the C-like and data-free restrictions?
Why require explicit discriminants?
Why not require an explicit repr?
My primary concern about stability reform is that I'm sure that there have been Rust programmers who assumed that discriminants are SemVer stable because they're observable with |
If you take a look at #3040, that was actually how this proposal started (specifically
Maybe, though, the real answer is that we want both for different purposes; an
Yes, I completely agree. I think that the aim of #3040 is to start the conversation about what that official guidance around
Thanks @jswrenn, by the way, for all of the time, thought, and effort you're putting in to helping to shape this direction (and all of the terrifying edge-cases you've been pointing out on the way), I really appreciate your thoughtful and knowledgeable help!
If you require explicit discriminants for I won't say it's 100% clear, but it's pretty close — the case that enum discriminants have a default type of Non-trivial derives always involve a sacrifice of explicitness. For instance, it's not explicitly clear how
Those users would be mistaken to do so; The issue of "cheapness" aside, there's already a version of Concretely, this: #[derive(AsRef)]
enum Foo {
A = 1,
B = 2,
C = 3,
} ...could expand to this: impl AsRef<isize> for Foo {
fn as_ref(&self) -> &'static isize {
match self {
Foo::A => &1,
Foo::B => &2,
Foo::C => &3,
}
}
} So, let me amend my proposal:
Yep, this is my thinking, too — these issues can be (and should be) treated basically independently from each other. |
bikeshedding: |
IntoRaw? |
@liigo Looks like it's going in the direction of |
This adds a trait to allow conversions from enums with a primitive
representation to their primitive representation.
This spun out of the related #3040.